Friday, August 12, 2011

Using Wireshark to Troubleshoot VoIP

Wireshark (formerly known as Ethereal; its command-line companion tshark was formerly tethereal) is a "network protocol analyzer. It lets you capture and interactively browse the traffic running on a computer network." If you want a definitive answer on what is causing problems with your VoIP calls, you need to learn how to use Wireshark. Thankfully, it's pretty simple for anyone with basic knowledge of networking.

There are two parts to successfully running a network trace: capturing the traffic and viewing the results. I'll go through capturing traffic on different operating systems and viewing the results on a Windows PC using Wireshark.

Capturing traffic on a Linux-based PBX

Assuming you're running CentOS, which is the default for most PBX software, you will need to install the Wireshark package:

yum install wireshark

Run the following command on your PBX to capture all traffic, both SIP (signalling) and RTP (audio), between the PBX and your provider's server into the file /root/my.cap. While the capture is running, restart your PBX software so it will attempt to register with your provider. Then attempt an outgoing phone call and an incoming phone call. Press Ctrl-C after running these tests to stop the capture.

tshark -w /root/my.cap host server.provider.com

Run the following command to compress your capture file my.cap into a compressed my.cap.gz file:

gzip /root/my.cap

From your Windows PC, run the following to copy that capture file from your PBX to your Windows PC's C: drive. You will need the pscp program, which is part of the PuTTY suite. You will be prompted for the PBX's root password. Replace 192.168.1.2 with the IP address of your PBX.

pscp root@192.168.1.2:/root/my.cap.gz c:\

Your capture file is now ready to be viewed from your Windows PC. Run Wireshark and open the capture file (Wireshark can open compressed capture files).

Capturing traffic on a Windows PC

If you have a Windows PC on the same network as your Voice over IP adapter, you can use it to capture all traffic between the adapter and your provider. Since a network switch isolates the traffic on each port, you cannot capture another device's traffic from your PC (unless the switch supports port mirroring). Instead, the easiest way to capture traffic on your local network is to get a cheap Ethernet hub. A hub repeats all traffic coming into one port on all of the other ports. This means any device plugged into a hub can view the traffic to/from every other device on the same hub.

Plug your hub into your router, then plug your PC and adapter into the hub.

Tip: A more advanced method of capturing traffic would be to plug the hub into your cable modem, so it sits between the modem and router. However, anything else you plug into the hub (like the PC running Wireshark) will be unprotected and open to all traffic from the Internet.

Start Wireshark and click on Capture Options. This will allow us to select the proper network card to capture traffic from, the host to capture traffic to/from (your provider's gateway) and a place to save the capture file. You must run the capture in "promiscuous" mode, which means Wireshark will "look" at any traffic it sees on the network, not just traffic to/from the host it's running on (your PC).

[Screenshot 1: Wireshark Capture Options]

While the capture is running, reboot your adapter so it will attempt to re-register with your provider. Also, attempt to make an incoming and outgoing call. After running these tests, click on the "Stop the running live capture" button. Your capture will already be loaded for viewing.

Reading and understanding a VoIP traffic capture

Now that you've captured the traffic between your adapter (or PBX) and your provider, you have enough information to figure out what is causing the problem. For most people, the easiest thing to do now is send the capture file to your provider. An experienced technical support agent will be able to immediately tell you what is causing the problem and how to fix it. As I mentioned in an earlier post (VoIP, no dial tone, missed calls & port forwarding), there's a 90% chance the problem is your router.

Tip: Unless you have significant networking experience, stop here and send the capture to your provider. It will be tremendously helpful to them in troubleshooting the issue and you can save yourself a few hours of time.

If you'd like to dig into the details and see exactly what is happening, then continue on. When you open your capture file, you'll see a split-screen with each Ethernet frame on the top, a user-readable drill-down of the contents in the middle, and the raw data on the bottom.

A working SIP REGISTER conversation

Scroll down in the top panel until you reach the first REGISTER attempt from your adapter to your provider. Select the frame and, in the middle panel, right-click on the "Session Initiation Protocol" section and pick "Expand Subtrees". Find the line starting with "Call-ID", right-click on it, and select Apply as Filter > Selected. The Filter box above the top panel will now show something like sip.Call-ID == "5b1c8fdb-90c4f2a3@10.0.1.100", meaning only the frames that contain that Call-ID will be shown. This makes it easier to isolate one conversation in a capture that may have a lot of extraneous traffic. This is what a proper REGISTER attempt will look like:

[Screenshot 2: a working REGISTER conversation]

Your adapter will attempt to register, but it doesn't send any Authorization data the first time. The provider will respond with a 401 Unauthorized and will include information such as the realm, nonce, and algorithm values. The adapter will then use these values, along with your SIP password, to compute a hashed response (the password itself is never sent), which it will include when re-registering. Since the second REGISTER attempt has Authorization information, the provider will accept the registration and reply with a 200 OK.
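
To make the challenge/response step more concrete, here is a minimal Python sketch of how a digest response can be computed under RFC 2617 with the MD5 algorithm. The function names and the sample username, password, and nonce are made up for illustration; your provider may use a different algorithm or additional parameters, so treat this as an approximation of what the adapter does rather than an exact description.

```python
import hashlib


def md5_hex(text):
    """Return the lowercase hex MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce,
                    qop=None, nc=None, cnonce=None):
    """Compute an RFC 2617 digest response (MD5 algorithm only)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # who you are
    ha2 = md5_hex(f"{method}:{uri}")                 # what you are asking for
    if qop:
        # With qop, the nonce count and client nonce are mixed in as well.
        return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


# Hypothetical values: a real 401 supplies the realm and nonce.
print(digest_response("1001", "provider.com", "secret",
                      "REGISTER", "sip:server.provider.com", "abc123"))
```

The value printed is what would appear in the response field of the Authorization header on the second REGISTER. If you know your SIP password, you can compare a value computed this way against the one in your capture to check whether the adapter is using the credentials you expect.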

You can dig into each SIP request and response to see exactly how the registration conversation works. Now that you know what a working REGISTER conversation looks like, you can identify one that is broken.

A broken SIP REGISTER conversation

The most common cause for an adapter not being able to register is a router/firewall issue. If the provider's 401 Unauthorized response never makes it back to the device, it can never properly send Authorization information to the provider to register.

[Screenshot 3: a broken REGISTER conversation]

As you can see in the registration attempt above, the adapter never gets the 401 Unauthorized, so it keeps trying without success. If your provider were looking at a trace from their end, they'd see a 401 Unauthorized being sent, but no REGISTER with an Authorization section, clearly indicating that the 401 was being blocked by the router/firewall.
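
The reasoning above can be written down as a small decision procedure. The following Python sketch is purely illustrative: the function name, the (direction, label) event format, and the result strings are my own assumptions, and you would have to fill in the events by hand from what you see in Wireshark.

```python
def diagnose_register(events):
    """Classify a REGISTER exchange as seen from the adapter's side.

    `events` is a list of (direction, label) tuples in capture order.
    direction is "out" (adapter to provider) or "in" (provider to adapter);
    label is a SIP method or status code as a string. Hypothetical format.
    """
    sent_register = any(d == "out" and label == "REGISTER" for d, label in events)
    got_any_reply = any(d == "in" for d, _ in events)
    got_401 = any(d == "in" and label == "401" for d, label in events)
    got_200 = any(d == "in" and label == "200" for d, label in events)

    if got_200:
        return "registered"
    if sent_register and not got_any_reply:
        # Matches the broken case: the 401 never reaches the adapter.
        return "no replies received: suspect router/firewall"
    if got_401:
        return "challenged but not accepted"
    return "inconclusive"


working = [("out", "REGISTER"), ("in", "401"), ("out", "REGISTER"), ("in", "200")]
broken = [("out", "REGISTER"), ("out", "REGISTER"), ("out", "REGISTER")]
print(diagnose_register(working))
print(diagnose_register(broken))
```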

Working outgoing & incoming calls

Most providers configure their adapters to only play a dial-tone when the adapter is registered. Therefore, once you are registered, you will probably have no issues making an outbound call. Also, an outbound call is a connection initiated from your private network, so there are usually no firewall issues to deal with either.

An outgoing call is initiated by an INVITE request sent from your adapter to the provider. The provider will reply with a 100 Trying and a 407 Proxy Authentication Required. This is similar to the 401 Unauthorized during the REGISTER request, except that the challenge arrives in a Proxy-Authenticate header instead of a WWW-Authenticate header. Your adapter will re-send the INVITE with your credentials in a Proxy-Authorization header. The provider will respond with a 100 Trying, 183 Session Progress (which will cause you to hear ringing), and 200 OK when the call is answered. Your adapter will acknowledge the answer with an ACK and the conversation will begin. Finally, when one of the sides hangs up, it sends a BYE, which the other side answers with a 200 OK.

Tip: You may not see this exact sequence of events, depending on how your provider handles the call and if it's actually answered. A busy or invalid number will return different responses than a 200 OK.

A working incoming call looks similar to an outgoing call, with INVITEs, 100s, 183s, and 200s.
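
When reading through a call, it helps to tell requests (INVITE, ACK, BYE) apart from responses (100, 183, 200, 407) at a glance. The Python sketch below classifies the first line of a SIP message. The function name is my own, and the phone number and host in the sample flow are placeholders rather than anything from a real capture.

```python
def parse_start_line(line):
    """Classify the first line of a SIP message.

    Returns ("response", status_code, reason) or ("request", method, uri).
    """
    parts = line.strip().split(" ", 2)
    if len(parts) < 3:
        raise ValueError(f"not a SIP start line: {line!r}")
    if parts[0].startswith("SIP/"):
        # Responses begin with the protocol version, then the status code.
        return ("response", int(parts[1]), parts[2])
    # Requests begin with the method, then the request URI.
    return ("request", parts[0], parts[1])


# Placeholder flow for an outgoing call that gets challenged once.
flow = [
    "INVITE sip:15551234567@server.provider.com SIP/2.0",
    "SIP/2.0 100 Trying",
    "SIP/2.0 407 Proxy Authentication Required",
    "INVITE sip:15551234567@server.provider.com SIP/2.0",
    "SIP/2.0 183 Session Progress",
    "SIP/2.0 200 OK",
]
for line in flow:
    print(parse_start_line(line))
```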

Broken outgoing & incoming call

It's rare to have an issue with outgoing calls if you have a dial-tone. However, a capture will show what's going wrong if you do.

If you're not receiving incoming calls at all, chances are you won't see anything on the capture, which is still useful information! A common problem is that you will receive incoming calls for a few minutes after your adapter is first powered on (and registers), but then your incoming calls don't ring through. If you can capture this entire sequence (register, successful incoming call, failed incoming call 15 minutes later), it will clearly demonstrate that there is a router/firewall issue.

Analyzing calls with Wireshark

Wireshark has telephony-specific features that may come in handy for troubleshooting VoIP calls. After opening your capture file, go to Telephony > VoIP Calls. Wireshark will automatically detect all of the calls in your capture.

You can then click "Prepare Filter" to easily view just the frames associated with a particular call, click "Flow" to see the conversation between your adapter and provider, or click "Player" to listen to one or both sides of the conversation in the capture.

Also, you can go to Telephony > RTP > Show All Streams. Select an RTP stream (the audio from one side of a phone call), click Find Reverse, then click Analyze. You can now see and graph statistics such as how many RTP packets were lost and the max and mean jitter.
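
If you're curious where numbers like these come from, the following Python sketch shows the running interarrival jitter estimate defined in RFC 3550, plus a simple count of missing sequence numbers. The function names are mine, the 8000 Hz clock rate is an assumption that fits common narrowband codecs such as G.711, and the figures Wireshark reports may differ slightly from what this produces.

```python
def interarrival_jitter_ms(arrival_times_s, rtp_timestamps, clock_rate=8000):
    """Running jitter estimate from RFC 3550, returned in milliseconds.

    arrival_times_s: capture time of each packet, in seconds.
    rtp_timestamps:  RTP timestamp field of each packet, in clock units.
    """
    jitter = 0.0
    for i in range(1, len(arrival_times_s)):
        # How far apart the packets arrived, converted to clock units.
        arrival_delta = (arrival_times_s[i] - arrival_times_s[i - 1]) * clock_rate
        # How far apart the sender says they were generated.
        sent_delta = rtp_timestamps[i] - rtp_timestamps[i - 1]
        d = arrival_delta - sent_delta
        # Smooth the absolute difference with a gain of 1/16.
        jitter += (abs(d) - jitter) / 16.0
    return jitter / clock_rate * 1000.0


def lost_packets(sequence_numbers):
    """Count gaps in RTP sequence numbers (ignores 16-bit wraparound)."""
    expected = max(sequence_numbers) - min(sequence_numbers) + 1
    return expected - len(set(sequence_numbers))


# Three 20 ms packets; the third one arrives 10 ms late.
print(interarrival_jitter_ms([0.0, 0.020, 0.050], [0, 160, 320]))
print(lost_packets([1, 2, 4, 5]))
```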

Finally, if you want to cut/paste a UDP SIP conversation into a support ticket with your provider, you can right-click on one of the frames in the conversation and select Follow UDP Stream. Change the radio button to ASCII before copying or saving the data.
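
Once you've saved a conversation as ASCII text, a short script can pull out the Call-ID values so you can rebuild the display filter for each conversation. This Python sketch is one way to do it. The function name is my own, and it assumes each header sits on its own line, either as "Call-ID:" or in the compact form "i:".

```python
import re

# Matches "Call-ID: value" or the compact form "i: value" at the start of a line.
CALL_ID = re.compile(r"^(?:Call-ID|i)\s*:\s*(\S+)", re.IGNORECASE | re.MULTILINE)


def call_id_filters(sip_text):
    """Return one Wireshark display filter per distinct Call-ID, in order."""
    seen = []
    for call_id in CALL_ID.findall(sip_text):
        if call_id not in seen:
            seen.append(call_id)
    return [f'sip.Call-ID == "{call_id}"' for call_id in seen]


# Abbreviated sample text; a real dump has many more headers per message.
sample = (
    "REGISTER sip:server.provider.com SIP/2.0\r\n"
    "Call-ID: 5b1c8fdb-90c4f2a3@10.0.1.100\r\n"
    "\r\n"
    "SIP/2.0 401 Unauthorized\r\n"
    "Call-ID: 5b1c8fdb-90c4f2a3@10.0.1.100\r\n"
    "\r\n"
)
for display_filter in call_id_filters(sample):
    print(display_filter)
```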

Conclusion

Now that you know how to capture and analyze your Voice over IP traffic using Wireshark, you have the ability to do some troubleshooting on your own that goes beyond rebooting the adapter. However, as I've mentioned before, sometimes it's far easier and faster to let your provider take over after you've provided them with a helpful capture showing the problem.

Please don't take this as an opportunity to hassle your provider over issues that don't cause any noticeable problems in your phone calls. There will be jitter; there will be dropped RTP packets. That is the nature of voice over IP and there are numerous methods of ensuring voice quality despite these network-level problems. Your provider is focused on providing you with clear phone calls, not on optimizing the statistics generated by Wireshark.

If you're interested in the inner workings of voice over IP and SIP, there's no better way to dig in than running a Wireshark capture and figuring out exactly what is going on.