pimd: Multicast routes don't always get their iif updated based on unicast routing changes

The network setup we have is:

S+Rn_.y = multicast source and receiver n, 
                    where .y is the least significant octet of the IPv4 address

Cisco_n = Cisco PIM Router
pimd_n = pimd PIM Router (pimd v2.3.2)

The RP is the router at the top of the picture.

              Cisco_1 --------------------  <BGP>
                 |                        |
S+R1_.165 -------| <ospf>                 |
                 |                     Cisco_2 ------ S+R2_.46
               Cisco_3                    |
                 |                        | <BGP>
S+R3_.117 -------| <ospf>                 |
                 |                        |
                 | <eth0>                 |
               pimd_4                     |
                 | <ndl0>                 |
                 |                        |
                 | <ospf>                 |
                 |                        |
                 | <ndl0>                 |
               pimd_5                     |
                 | <eth0>                 |
                 |                        |
                 | <ospf>                 |
S+R4_.101 -------|------- Cisco_6         |
                 |           |            |
                 |           | <ospf>     |
                 |           |            |
               pimd_7      Cisco_8        |
                 |           |            |
           <ospf>|        S+R5_.85        |
                 |                        |
               pimd_9                     |
                 |                        |
                 |<ospf>                  |
                 |                        |
              Cisco_10 --------------------

Reachability via OSPF is preferred.
The BGP link between Cisco_10 and Cisco_2 is preferred for all pimd_n routers.
The Cisco_1 router is the only router that prefers the direct BGP route to the S+R2_.46 network.

We see intermittent failures when the link between the pimd routers is restored. That is, if the link between pimd_4 and pimd_5 has been down long enough for the unicast routing to converge, then when the link is restored the multicast routing doesn’t always converge (it fails approx. 50% of the time). Specifically, the multicast receiver S+R3_.117 might not see traffic from sources S+R4_.101 and/or S+R5_.85 and/or S+R2_.46. Which sources are lost from which receiver varies in the current setup.

Tracking through debug logs it appears that the change in the unicast routing is not always picked up.

The attached logs are from pimd_4 and pimd_5.

CoolSq_10.200.55.119_20160615-160927_pimd_4.txt SqCool_10.200.55.106_20160615-161131_pimd_5.txt

Events of interest are on the pimd_4 router log (I’ve bolded the timestamps of the symptoms):

  • 23:13:58 (approx) link is restored
  • 23:13:59.034 : first report of multicast packet being received over the wrong iif after the link restored.
  • 23:13:59.232 : PIM neighbour restored with pimd_5
  • 23:13:59.471 : first join/prune received from pimd_5
  • 23:14:20.423 : it looks like the unicast_routing_timer has triggered and the various group routes are being tested. Both sources, .85 and .101, return an RPF of vif 1, but you can see by comparing the previous and next summaries that .85 has changed Incoming : I.. to Incoming : I.. while .101 remains Incoming : I..
  • 23:14:40 : it looks like the source .101 has the incoming interface updated correctly, but there are still subsequent reports of the multicast coming in on the wrong interface: 23:14:57.732 Wrong iif: src 10.200.55.101, dst 239.0.4.1, iif 1

It is as if the “change_interfaces” call on line 641 of timer.c isn’t actually causing the change to be applied to the group. Maybe it needs the force change flag set?

About this issue

  • State: closed
  • Created 8 years ago
  • Comments: 42 (23 by maintainers)

Most upvoted comments

I think that testing is now complete and successful for this specific use-case. I added a debug log at line 260 in place of the deletion, and that path is definitely executed during testing, so the change in behaviour appears to be necessary for the fix. The patch #110 log is not appearing during testing, but I did some due-diligence review of the code and agree that the code makes no sense at all without it. It looks like a cut-paste-and-modify error based on lines 412/413, so the pattern should be: save the existing route (lines 646, 647), call set_incoming to determine the new incoming/upstream (line 648), then test whether set_incoming has changed things and update (lines 662, 663; pre-patch, this is checking the wrong field for changes).
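For anyone following along, the save / recompute / compare pattern described above can be sketched roughly as below. This is not pimd's code: the struct layouts, field names and the apply_rpf function are hypothetical stand-ins (apply_rpf plays the role that set_incoming plays in timer.c), and comparing both the interface and the upstream neighbour is an assumption about what "changed" should mean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a multicast route entry. */
struct route_sketch {
    uint32_t source;    /* source address */
    int      incoming;  /* index of the incoming interface (iif) */
    uint32_t upstream;  /* address of the upstream (RPF) neighbour */
};

/* Hypothetical result of a fresh unicast RPF lookup for the source. */
struct rpf_sketch {
    int      iif;
    uint32_t neighbor;
};

/*
 * Save the old values, apply the new RPF result, then compare the
 * fields that were actually recomputed. Returns true when the route
 * changed, i.e. when the caller would need to update the interfaces.
 */
bool apply_rpf(struct route_sketch *rt, struct rpf_sketch rpf)
{
    assert(rt != NULL);

    /* 1. Save the existing route state. */
    int      old_incoming = rt->incoming;
    uint32_t old_upstream = rt->upstream;

    /* 2. Recompute incoming/upstream (stand-in for set_incoming). */
    rt->incoming = rpf.iif;
    rt->upstream = rpf.neighbor;

    /* 3. Compare against the saved copy, not against another field. */
    return rt->incoming != old_incoming || rt->upstream != old_upstream;
}
```

With a route whose iif is 0 and a lookup result of vif 1, the first call reports a change and the second call with the same result does not, which is the behaviour expected on each pass once the unicast routing has settled.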

NB Lines 666 and 667 almost certainly hide the logic error and cause the correct behaviour to occur on the next pass through age_routes so it is arguably not necessary to have the patch in order to fix this issue.

For our use-case of very, very stable sparse group membership (basically every element joins once and then stays a member until it reboots), this fix appears to work. It is only executed on network topology change and appears to have no side-effects. I'm not sure what regression tests would be needed to verify other use-cases where members/sources are more dynamic (the intent of the original code appears to be to clean up kernel caching for sources that are no longer in the group).

I tried to reproduce the problem, but as I do not have a Cisco box for the RP I had to use pimd for the RP as well. Unfortunately there seem to be some issues with pimd's RP functionality in this kind of scenario (but that is another story).

Anyway, after making the following two changes, link break/remake looked like it was working.

  • First, it still looks to me that patch #110 is mandatory.
  • Second, in timer.c, the function check_spt_threshold removes the idle kernel cache entry but seems to leave the pimd state as it was. So I suggest that you try removing line 260: delete_single_kernel_cache(mrt, kc);

You can also use the command “ip mroute show all” to see what the multicast routing table looks like from the kernel's point of view. If that is very different from pimd's view of active routes, then the situation is not good.

There was an interesting log entry every 3.30 minutes: “S,G entry changed (timeout) causing interface switch at iif 1”. I suppose you have added that entry to age_routes, but as it keeps coming over and over again, the change does not take effect for some reason. Could you add a diff or something to show what your changes are and where they are?

@troglobit, sorry for not replying with feedback to the forum on @idismmxiv's patch. I tried this one last year, but the test system didn’t show an improvement; moreover, it created more “noise”. I’ll clarify the detailed results with my QA and attempt to provide a better comment. Bottom line: it didn’t solve my issue.