Great to hear that the problem has been resolved. It’s often stressful when you’ve got a production network that isn’t working and you’re trying to troubleshoot. Now that everything is OK, you have the time to go over the issues calmly and coolly to learn for the future!
Anyway, now that I’ve said that, here goes:
Troubleshooting multicast packet loss should be no different from troubleshooting packet loss in general. Multicast doesn’t inherently add any characteristics to packet loss that would not be present under other circumstances. Both the Mstat and the Mtrace utilities are useful for troubleshooting multicast routing failures and for examining the behavior of a multicast topology, including statistics and choice of path. However, they don’t deal with all types of packet loss, only loss that may be due to faulty multicast routing.
Unless your packet loss is due exclusively to faulty multicast routing, it is typically caused by issues independent of multicast, such as congestion, corrupted packets, MTU issues, or misconfigurations in ACLs, route maps, or other policies.
Indeed, Wireshark is not so helpful here for the reasons you give, but also because you need to know where on the network to collect packets. You first have to zero in on the problem and then choose which packets to capture.
However, for voice applications that use protocols such as SIP and RTP, Wireshark can be VERY helpful because it has a great telephony utility that lets you follow a whole conversation and see both the SIP exchanges and the voice packets that belong to it. It analyzes these exchanges in detail, so you can examine the quality of the voice conversation. All of these tools are found under the “Telephony” menu option in the application.
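To give a feel for the kind of per-stream loss figure such an analysis produces, here is a minimal Python sketch that estimates loss from the 16-bit RTP sequence numbers of packets in arrival order. It is not Wireshark’s own code. The function name `rtp_loss` and its input format are assumptions made for illustration, and duplicate packets are not handled specially.

```python
def rtp_loss(seqs):
    """Estimate loss from 16-bit RTP sequence numbers given in arrival order.

    Returns (expected, lost). Hypothetical helper for illustration only.
    """
    if not seqs:
        return 0, 0
    highest = seqs[0]
    span = 0  # how far the highest sequence number has advanced from the first
    for seq in seqs[1:]:
        # Modulo-65536 distance, so a wrap from 65535 to 0 counts as +1.
        delta = (seq - highest) & 0xFFFF
        if 0 < delta < 0x8000:
            span += delta
            highest = seq
        # Otherwise the packet is late or duplicated; it does not move the maximum.
    expected = span + 1
    lost = max(expected - len(seqs), 0)
    return expected, lost


# 65535 and 2 are missing from this wrapped sequence.
print(rtp_loss([65533, 65534, 0, 1, 3]))  # (7, 2)
```

Reordered packets such as `[10, 12, 11, 13]` are counted as received rather than lost, because only the highest sequence number seen so far defines how many packets were expected.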
I have a question regarding how a receiver can join a mcast group during the register suppression window.
Let’s assume a publisher is trying to register the source with the RP and there are currently no active recipients.
So the RP sends a Register-Stop message to the source and a register suppression timer starts.
During this period, if an active receiver sends a PIM join message for that group to the RP, how long does the receiver need to wait to receive the mcast stream? Does it need to wait until the end of the register suppression timer window? Does the source sit idle until the timer expires before it can start sending the stream?
This is an excellent question. If a subscriber sends a join message to its local router, this will be forwarded as expected to the RP. If a PIM Register-Stop message had previously been sent to the first-hop router (the router connected to the multicast source), the RP will typically wait for the first-hop router to send a new PIM Register message. This is assuming a strict adherence to the “rules” of this operation. This means that the host wanting to join will have to wait for this timer to elapse so that the first-hop router will start sending multicast traffic to the RP once again.
However, as you have suggested in your post, this seems inefficient. So I tried labbing this up, and found that if a join message reaches the RP during that timer, the RP will automatically send a join message to the first-hop router, informing it that it wants to receive traffic. The first-hop router in turn will ignore its timer and will begin sending multicast traffic.
This is highly vendor and platform specific, and it may be that some IOS versions or other vendors don’t do this while others do. It is likely that most modern implementations do this to avoid the inefficiencies involved.
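To make the two behaviors concrete, here is a toy Python model of the first-hop router for a single source and group. It is only a sketch of the logic described above, not any vendor’s implementation. The class and method names are made up for illustration, the 60-second value is the default Register_Suppression_Time from RFC 7761 (platforms may use something else), and details such as the Null-Register probe sent shortly before the timer expires are left out.

```python
from dataclasses import dataclass

# Default Register_Suppression_Time in RFC 7761; real platforms may differ.
REGISTER_SUPPRESSION_SECONDS = 60.0


@dataclass
class FirstHopRouter:
    """Toy model of a first-hop router's behavior toward the RP for one (S,G)."""

    suppressed_until: float = 0.0  # time at which register suppression ends
    native_join: bool = False      # True once a join from the RP has arrived

    def on_register_stop(self, now: float) -> None:
        # A Register-Stop from the RP starts the suppression window.
        self.suppressed_until = now + REGISTER_SUPPRESSION_SECONDS

    def on_sg_join(self) -> None:
        # A join from the RP means a receiver exists downstream.
        self.native_join = True

    def action(self, now: float) -> str:
        if self.native_join:
            return "forward natively"  # the suppression timer no longer matters
        if now < self.suppressed_until:
            return "suppressed"        # strict behavior: wait for the timer
        return "send register"


fhr = FirstHopRouter()
fhr.on_register_stop(now=1.0)
print(fhr.action(10.0))  # suppressed
fhr.on_sg_join()
print(fhr.action(11.0))  # forward natively
```

Without the call to `on_sg_join()`, the model stays in `suppressed` until the 60 seconds have passed, which corresponds to the strict reading of the rules. With it, traffic flows right away, which matches what I saw in the lab.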
I work in a low-latency trading environment where multicast is heavily used. I was giving a presentation on PIM Join and Register, and one of the dev guys asked this question and I was clueless for some time. I will get back to him with this explanation.