SIP (Session Initiation Protocol) is an application-layer control protocol used for signaling in VoIP. It is the most widely used and supported signaling protocol in VoIP today and, because it is an open protocol, it is supported on a wide array of commercially available devices such as the Linksys PAP2, Cisco phones, and many, if not all, IP PBXs. The protocol is used to create two-party, multiparty, or multicast sessions. It is independent of the underlying transport, meaning that it can use TCP, UDP, SCTP, etc. for signaling, and it is compatible with both IPv4 and IPv6. It is also a text-based protocol using UTF-8 encoding, which makes SIP messages human readable.
SIP typically operates on the default port of 5060 and connects servers with clients and other endpoints. It can set up voice, video, and data sessions. Over the course of its development, SIP has come to deliver many of the advanced call processing features of SS7, including ANI (Automatic Number Identification), CPN (Calling Party Number), and DNIS (Dialed Number Identification Service) delivery.
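SIP supports five aspects of establishing and terminating communications:
1. User Location: determining the end system to be used for the communication.
2. User Availability: determining whether the called party is willing to engage in the communication.
3. User Capabilities: determining the media and media parameters to be used.
4. Session Setup: "ringing" and establishment of call parameters at both the called and calling ends.
5. Session Management: transfer and termination, modifying session parameters, and invoking services.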
SIP communicates via messages. Each message is a request or a response made up of a start line, a set of headers, and an optional body, not unlike HTTP.
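To make the HTTP comparison concrete, here is a minimal sketch of splitting a SIP message into its start line, headers, and body. The sample INVITE, its addresses, and the function name parse_sip_message are made up for illustration, and the parser deliberately ignores details such as folded (multi-line) headers and compact header names.

```python
def parse_sip_message(raw: str):
    """Split a SIP message into (start_line, headers, body).

    Headers are returned as a dict mapping lower-cased header names to a
    list of values, because a header such as Via may appear more than once.
    """
    # A blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return start_line, headers, body


# Hypothetical INVITE, using documentation-only addresses.
SAMPLE_INVITE = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bK776asdhds\r\n"
    "Max-Forwards: 70\r\n"
    "To: Bob <sip:bob@example.com>\r\n"
    "From: Alice <sip:alice@example.com>;tag=1928301774\r\n"
    "Call-ID: a84b4c76e66710@192.0.2.10\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

start_line, headers, body = parse_sip_message(SAMPLE_INVITE)
method, request_uri, version = start_line.split(" ")
print(method, request_uri, version)
print(headers["call-id"][0])
```

The request line plays the same role as the request line in HTTP (method, target, protocol version), and the headers that follow are plain "Name: value" pairs, which is why a captured SIP exchange can be read directly in Wireshark.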
Exploiting SIP vulnerability
The first thing you are going to do is run Wireshark on a computer on the same subnet as the computer that will hold the conversation. This can be the computer running the softphone, or any other computer that can sniff the packets. I found it easier to set up my laptop to monitor the conversation, but that's just me.
Since I chose to monitor the conversation from another computer/IP, I need a trick to make sure that I capture both sides of the conversation. For this I will use an ARP cache poisoning attack, carried out with arpspoof. I simply opened a terminal and typed:
arpspoof -i ath0 192.168.1.1 (this is because the host, i.e. the router, is at 192.168.1.1)
I will write more about ARP spoofing in future posts.
Now, fire up Wireshark and begin capturing packets on the interface that is connected to the network. In my case that was my Wi-Fi card: ath0. Next, place a call with the softphone we configured in the last tutorial, and have a lovely conversation with whoever picks up. Stop capturing once the conversation is over, and take a look at what you've got. In Wireshark, click on Analyze -> Decode As, select SIP from the list, then click Apply. Go back to the list of captured packets, sort them by protocol, and highlight a packet that reads something like:
“RTP type=ITU-T G.711 PCMU, SSRC=blah blah blah”.
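For reference, the payload type and SSRC that Wireshark shows in that summary line come from the 12-byte fixed header at the start of every RTP packet. The following is a rough sketch of decoding that header by hand. The function name parse_rtp_header and the sample packet are invented for illustration, and the sketch ignores CSRC entries and header extensions.

```python
import struct

# Two static payload types from the RTP audio/video profile.
PAYLOAD_TYPES = {0: "PCMU (G.711 mu-law)", 8: "PCMA (G.711 A-law)"}


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the 12-byte fixed RTP header into a dict of fields."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    b0, b1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Hypothetical packet: version 2, payload type 0, followed by 160 bytes
# of mu-law audio (20 ms at 8000 samples per second).
sample = struct.pack("!BBHII", 0x80, 0x00, 1, 160, 0x12345678) + b"\xff" * 160
fields = parse_rtp_header(sample)
print(PAYLOAD_TYPES.get(fields["payload_type"], "other"), hex(fields["ssrc"]))
```

Everything after the header is the encoded audio itself, which is why an unencrypted RTP stream can be saved and played back.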
Next click on Statistics -> RTP -> Show All Streams (in newer Wireshark releases the RTP stream list lives under the Telephony menu instead). This will show you all of the RTP streams that you captured. One will be the forward stream, and one will be the reverse. You can usually tell them apart by the IP addresses, but there is also a "find the reverse stream" button. Click on the forward stream, shift + click on the reverse, and click on the Analyze button.
Another window will pop up with a button on the bottom left labeled "Save Stream", or something very similar. Click on it, select .au as the format, name your file, and save. You can then use Audacity to convert the .au file into a .wav or .mp3.
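If you would rather script the conversion than open Audacity, the sketch below shows roughly what is involved. It assumes the .au file holds either 8-bit G.711 mu-law samples (AU encoding 1) or 16-bit linear PCM (AU encoding 3); which one your Wireshark version writes is not something I have checked, so anything else is rejected rather than guessed at. The function names are my own.

```python
import io
import struct
import wave


def ulaw_byte_to_pcm16(byte: int) -> int:
    """Expand one G.711 mu-law byte to a signed 16-bit linear sample."""
    u = ~byte & 0xFF                      # mu-law bytes are stored inverted
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if u & 0x80 else magnitude


def au_to_wav(au_data: bytes, wav_file) -> int:
    """Convert AU bytes to WAV, written to a path or file object.

    Returns the number of samples written.
    """
    # The AU header is six big-endian 32-bit fields.
    magic, offset, size, encoding, rate, channels = struct.unpack(
        ">4sIIIII", au_data[:24]
    )
    if magic != b".snd":
        raise ValueError("not an AU file")
    payload = au_data[offset:]
    if size != 0xFFFFFFFF:                # 0xFFFFFFFF means "size unknown"
        payload = payload[:size]
    if encoding == 1:                     # 8-bit mu-law
        samples = [ulaw_byte_to_pcm16(b) for b in payload]
    elif encoding == 3:                   # 16-bit linear PCM, big-endian
        count = len(payload) // 2
        samples = list(struct.unpack(">%dh" % count, payload[: count * 2]))
    else:
        raise ValueError("unsupported AU encoding %d" % encoding)
    with wave.open(wav_file, "wb") as out:
        out.setnchannels(channels)
        out.setsampwidth(2)               # WAV output is 16-bit little-endian
        out.setframerate(rate)
        out.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return len(samples)


# Tiny in-memory demo: a 3-sample mu-law AU file at 8000 Hz, mono.
demo_au = struct.pack(">4sIIIII", b".snd", 24, 3, 1, 8000, 1) + bytes([0xFF, 0x00, 0x80])
demo_wav = io.BytesIO()
written = au_to_wav(demo_au, demo_wav)
print(written, "samples written")
```

For a real capture you would read the saved file with open("call.au", "rb").read() and pass a path such as "call.wav" as the second argument; both file names here are placeholders.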
In Part 2 of this tutorial we will explore brute-forcing authentication, DoS attacks, and injecting audio into ongoing conversations using RTP packet injection techniques.