tsduck: Continuity Errors Count skyrocketing when two TSP instances with same multicast address output (but different ports) are running on same network

Hi @lelegard

I thought I had opened a bug ticket by mistake earlier today, so I closed it. But it seems I actually have found a bug, so I am opening a new one. Please see the explanation below.

Bug description:

Continuity Errors Count skyrocketing when two TSP instances with same multicast address output (but different ports) are running on same network

  • When two TSP instances with the same multicast output address (but different ports, obviously) are running on the same network, the number of Continuity Errors skyrockets. We will call these instances TSP1 and TSP2.
  • If only TSP1 is launched, not a single CC error is detected.
  • Likewise, if only TSP2 is launched, not a single CC error is detected either.
  • However, as soon as both TSP1 and TSP2 are running at the same time, the Continuity Errors Count skyrockets, from the very first second the two TSP processes run concurrently.

How to reproduce:

Equipment involved to easily reproduce the problem:

  • One MPEG2-TS playout/streamer, aka “sender”, with one physical Ethernet port used as TS output. We will call it Playout-NIC1.
  • One Linux server with two different physical Ethernet ports used as TS inputs/outputs. We will call them TSDuck-NIC1 and TSDuck-NIC2. The connection involves a switch: Playout-NIC1 to Switch, Switch to TSDuck-NIC1, and Switch to TSDuck-NIC2.

TSP1

tsp --verbose --bitrate 50000000 --initial-input-packets 7 --max-input-packets 7 --max-flushed-packets 7 -I ip 225.1.1.1:11111 -l 10.168.1.153 -O ip 239.1.1.1:4001 -l 10.168.1.153 --packet-burst 7 --enforce-burst

TSP2

tsp --verbose --bitrate 50000000 --initial-input-packets 7 --max-input-packets 7 --max-flushed-packets 7 -I ip 225.1.1.1:11111 -l 10.168.1.154 -O ip 239.1.1.1:4002 -l 10.168.1.154 --packet-burst 7 --enforce-burst

Edit1: I have located the bug. It seems to be related to using the same multicast output address, in my case 239.1.1.1. If one of the TSP instances uses a different multicast output address, the number of CC errors no longer rises and simply stays at 0. Example: TSP1 with 239.1.1.1 and TSP2 with 239.1.1.2 works.

Expected behavior:

The number of CC errors must remain = 0, even if two instances of TSP are running at the same time on the same network, with the same multicast output address (but obviously a different port).

Errors and logs:

No error is reported by the above commands. But Dektec, TSReader and TSDuck with the continuity plugin all show a rising number of CC errors, as follows:

* continuity: packet index: 31,167, PID: 0x026C, missing 7 packets
* continuity: packet index: 31,199, PID: 0x026C, missing 10 packets
* continuity: packet index: 31,200, PID: 0x02D0, missing 9 packets
* continuity: packet index: 31,220, PID: 0x026C, missing 12 packets
* continuity: packet index: 31,221, PID: 0x02D0, missing 13 packets
* continuity: packet index: 31,248, PID: 0x02D0, missing 5 packets
* continuity: packet index: 31,249, PID: 0x026C, missing 6 packets
* continuity: packet index: 31,291, PID: 0x026C, missing 6 packets
* continuity: packet index: 31,292, PID: 0x02D0, missing 6 packets
* continuity: packet index: 31,346, PID: 0x026C, missing 15 packets
* continuity: packet index: 31,402, PID: 0x026C, missing 6 packets
* continuity: packet index: 31,403, PID: 0x02D0, missing 7 packets
* continuity: packet index: 31,437, PID: 0x026C, missing 9 packets
* continuity: packet index: 31,438, PID: 0x02D0, missing 9 packets
* continuity: packet index: 31,459, PID: 0x026C, missing 13 packets
* continuity: packet index: 31,460, PID: 0x02D0, missing 13 packets
* continuity: packet index: 31,472, PID: 0x02D0, missing 12 packets
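
To compare runs (both instances versus a single one), it can help to total the missing packets per PID. The following is only a minimal sketch in Python: the regular expression is written against the message layout shown in the log above, and the names `LINE_RE` and `tally_missing` are hypothetical helpers, not part of TSDuck.

```python
import re
from collections import Counter

# Matches messages laid out like the log above, for example:
# * continuity: packet index: 31,167, PID: 0x026C, missing 7 packets
LINE_RE = re.compile(
    r"continuity: packet index: ([\d,]+), PID: (0x[0-9A-Fa-f]+), missing (\d+) packets"
)

def tally_missing(lines):
    """Return a Counter mapping PID (as int) to the total number of missing packets."""
    totals = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            pid = int(match.group(2), 16)
            totals[pid] += int(match.group(3))
    return totals

# First three messages from the log above.
sample = [
    "* continuity: packet index: 31,167, PID: 0x026C, missing 7 packets",
    "* continuity: packet index: 31,199, PID: 0x026C, missing 10 packets",
    "* continuity: packet index: 31,200, PID: 0x02D0, missing 9 packets",
]

for pid, missing in sorted(tally_missing(sample).items()):
    print(f"PID 0x{pid:04X}: {missing} missing packets")
```

Lines that do not match the pattern are ignored, so the whole output of the continuity plugin can be passed in unfiltered.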

Environment:

  • OS: Linux
  • OS version: CentOS 7.9
  • TSDuck full version: 3.29-2627
  • Installation type: Official binary

Additional information:

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 15 (8 by maintainers)

Most upvoted comments

I’m sure you already have, but I found these two interesting links on Google which should help you find out how to use bind() for our matter

They just repeat what I already mentioned and what is already implemented in TSDuck: on Linux, the socket shall be bound to the multicast address, not the local address. And, if needed, the join group is sent on the local interface.
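
For readers unfamiliar with that pattern, here is a minimal sketch in Python of a Linux receiver that binds to the multicast address and joins the group on a given local interface. It is an illustration only, not TSDuck's actual code; `make_mreq` and `open_multicast_receiver` are hypothetical names, and binding to a multicast address is assumed to behave this way on Linux only.

```python
import socket

def make_mreq(group, local_ip):
    """Build the 8-byte ip_mreq value: group address followed by local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(local_ip)

def open_multicast_receiver(group, port, local_ip):
    """Open a UDP socket bound to the multicast group and joined on local_ip (Linux)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Allow several receivers on the same host to share the address and port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Bind to the multicast address, not the local address: on Linux this keeps
    # datagrams sent to other destination addresses on the same port out of this socket.
    sock.bind((group, port))
    # Send the group join on the chosen local interface.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, make_mreq(group, local_ip))
    return sock
```

With the values from the report, the input side of TSP1 would correspond to `open_multicast_receiver("225.1.1.1", 11111, "10.168.1.153")`, followed by `recvfrom()` calls on the returned socket.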

The first link also mentions:

Having multiple multicast groups that use the same port is a suboptimal design for the exact situation you’ve found yourself in: it makes it hard to filter the traffic to just what you want if another application on the same machine is also listening to traffic that shares the same port. It’s generally best to avoid sharing either multicast group or port across different logical channels.

Which is what you did and what we advised not to do…
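
As a quick way to apply that advice before launching several instances, the sketch below (Python, with the hypothetical helper `shared_fields`) lists the pairs of output destinations that share a multicast group or a port. It only compares `group:port` strings and makes no claim about what any stack does with them.

```python
from itertools import combinations

def shared_fields(destinations):
    """Return (a, b, reasons) for each pair of 'group:port' strings sharing a group or a port."""
    parsed = [(dest, *dest.rsplit(":", 1)) for dest in destinations]
    conflicts = []
    for (a, group_a, port_a), (b, group_b, port_b) in combinations(parsed, 2):
        reasons = []
        if group_a == group_b:
            reasons.append("group")
        if port_a == port_b:
            reasons.append("port")
        if reasons:
            conflicts.append((a, b, reasons))
    return conflicts

# Outputs from the report: same group, different ports.
print(shared_fields(["239.1.1.1:4001", "239.1.1.1:4002"]))
# Outputs from Edit1 of the report: different groups and ports.
print(shared_fields(["239.1.1.1:4001", "239.1.1.2:4002"]))
```

The first call reports the shared group for the two outputs in the original commands; the second returns an empty list for the configuration that was reported to work.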