mediasoup: consuming plain producer results in worker crash
I’m pushing video to a mediasoup (3.11.4) producer with GStreamer. There are no problems as long as the producer has no consumers. But after about 80 seconds of consuming (the video plays perfectly the whole time), the mediasoup worker process dies with the failed assertion 'this->buffer.size() <= MaxSeq': StorageItemBuffer contains more than 65535 entries
Core dump:
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
set = {__val = {0, 94619480744673, 28202962353000, 56, 94619497407008, 94619497407720, 140736458774320, 94619497345440, 140283140088888, 94619480744716, 94619497031937, 28202962353000, 56,
0, 18446744073709551615, 18446744073709551615}}
pid = <optimized out>
tid = <optimized out>
ret = <optimized out>
#1 0x00007f96356aa7f1 in __GI_abort () at abort.c:79
save_stage = 1
act = {__sigaction_handler = {sa_handler = 0x1, sa_sigaction = 0x1}, sa_mask = {__val = {0, 0, 511101108348, 395136991342, 140283121853312, 94619497341264, 512, 140283121851456,
4294934528, 94619569736784, 32769, 32705, 140283118359213, 390842024046, 140283121854080, 140283121836704}}, sa_flags = 896436269, sa_restorer = 0x560e55d448e0}
sigs = {__val = {32, 0 <repeats 15 times>}}
__cnt = <optimized out>
__set = <optimized out>
__cnt = <optimized out>
__set = <optimized out>
#2 0x0000560e50497752 in RTC::RtpStreamSend::StorageItemBuffer::Insert(unsigned short, RTC::RtpStreamSend::StorageItem*) ()
No symbol table info available.
#3 0x0000560e50497b1e in RTC::RtpStreamSend::StorePacket(RTC::RtpPacket*, std::shared_ptr<RTC::RtpPacket>&) ()
No symbol table info available.
#4 0x0000560e50497e1a in RTC::RtpStreamSend::ReceivePacket(RTC::RtpPacket*, std::shared_ptr<RTC::RtpPacket>&) ()
No symbol table info available.
#5 0x0000560e504a9157 in RTC::SimpleConsumer::SendRtpPacket(RTC::RtpPacket*, std::shared_ptr<RTC::RtpPacket>&) ()
No symbol table info available.
#6 0x0000560e50479b3b in RTC::Router::OnTransportProducerRtpPacketReceived(RTC::Transport*, RTC::Producer*, RTC::RtpPacket*) ()
No symbol table info available.
#7 0x0000560e50465f33 in RTC::Producer::ReceiveRtpPacket(RTC::RtpPacket*) ()
No symbol table info available.
#8 0x0000560e504c0a91 in RTC::Transport::ReceiveRtpPacket(RTC::RtpPacket*) ()
No symbol table info available.
#9 0x0000560e50451ffb in RTC::PlainTransport::OnRtpDataReceived(RTC::TransportTuple*, unsigned char const*, unsigned long) ()
No symbol table info available.
#10 0x0000560e5045316e in non-virtual thunk to RTC::PlainTransport::OnUdpSocketPacketReceived(RTC::UdpSocket*, unsigned char const*, unsigned long, sockaddr const*) ()
No symbol table info available.
#11 0x0000560e50820cd3 in uv.udp_recvmmsg ()
No symbol table info available.
#12 0x0000560e50821a43 in uv.udp_io ()
No symbol table info available.
#13 0x0000560e50825286 in uv.io_poll ()
No symbol table info available.
#14 0x0000560e50817a06 in uv_run ()
No symbol table info available.
#15 0x0000560e503acdb9 in DepLibUV::RunLoop() ()
No symbol table info available.
#16 0x0000560e503ba751 in Worker::Worker(Channel::ChannelSocket*, PayloadChannel::PayloadChannelSocket*) ()
No symbol table info available.
#17 0x0000560e503ab2e6 in mediasoup_worker_run ()
No symbol table info available.
#18 0x0000560e503a9e74 in main ()
No symbol table info available.
About this issue
- State: closed
- Created 2 years ago
- Comments: 48 (30 by maintainers)
Commits related to this issue
- Add broadcasters/gstreamer-with-http-video.sh b/broadcasters/gstreamer-with-http-video.sh to reproduce crash So we can reproduce this crash in mediasoup https://github.com/versatica/mediasoup/issues/... — committed to versatica/mediasoup-demo by ibc a year ago
- Fix crash in RtpStreamSend due to this->buffer.size() > MaxSeq - Fixes #975 - For now just some logs added to diagnose the problem. — committed to versatica/mediasoup by ibc a year ago
In some cases there is just no way to work around broken clients in a sane way. With more complex logic we can shorten the max buffer size to less than 2^16-1, but I think that should be a follow-up optimization.
I believe that there is a consensus to make it not crash first and optimize later.
Honestly I don’t think this is the way to go. We must not assume frame max size. The encoder may need to send many packets with same timestamp due to huge video frames. If the client is buggy (such as in this issue AFAIU) then that’s its problem. What we have to fix is the Insert() method which is the one that should never store more than MaxSeq packets.
idx <= static_cast<uint16_t>(this->buffer.size() - 1) ensures this in one of the branches: it will simply start overwriting old values once the buffer reaches its maximum number of elements.
The last else branch is where this issue probably happens. I guess there was an incorrect expectation about sequence numbers there, and some packets do get processed far enough out of order for the sequence number to overflow and become lower than this->startSeq. I think a similar check for buffer size needs to be added here, so that it starts overwriting values instead of pushing beyond the buffer size.
It will result in messed-up packet contents when the decoder tries to interpret them, but we for sure must not crash here.
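The "overwrite instead of growing" idea from this comment can be sketched as below. This is only an illustration under stated assumptions, not mediasoup's actual StorageItemBuffer: the names Item and BoundedSeqStore are hypothetical, and the slot is chosen with a plain modulo rather than relative to a start sequence number.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a stored packet; not mediasoup's StorageItem.
struct Item
{
	uint16_t seq{ 0 };
	bool used{ false };
};

// Fixed-capacity store indexed by RTP sequence number modulo Capacity.
// The slot count is fixed, so the store cannot grow past Capacity no matter
// how far out of order (or how often wrapped) the sequence numbers arrive.
template<std::size_t Capacity>
class BoundedSeqStore
{
public:
	// Stores seq, overwriting whatever older entry shared the same slot.
	void Insert(uint16_t seq)
	{
		Item& slot = this->slots[seq % Capacity];

		if (!slot.used)
			++this->count;

		slot.seq  = seq;
		slot.used = true;
	}

	// Returns true only if this exact sequence number is still stored.
	bool Contains(uint16_t seq) const
	{
		const Item& slot = this->slots[seq % Capacity];

		return slot.used && slot.seq == seq;
	}

	// Number of occupied slots; never exceeds Capacity.
	std::size_t Size() const
	{
		return this->count;
	}

private:
	std::array<Item, Capacity> slots{};
	std::size_t count{ 0 };
};
```

With a capacity that divides 65536 (for example a power of two), the slot index stays continuous across the 65535 to 0 wraparound. The trade-off is the one described above: an old packet that gets overwritten can no longer be retransmitted, but the size invariant always holds.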
If it can help, we have a video that does not crash the worker even without do-timestamp: https://codeda.com/data/videoRecordH264.mp4 You just need to replace the url param in the demo.
It seemed to help on the first try!! We will keep testing and keep you updated. Thanks!
Ah, I understand: it’s the RTP packet timestamps which are all the same, but the codec packets’ pts within them are correct, so the stream is playable. We will look into it on our side, thanks.
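One way to spot this symptom on the sender side is to measure how many consecutive RTP packets carry the same timestamp. The sketch below is a hypothetical diagnostic helper, not part of mediasoup or GStreamer; RtpHeaderInfo and LongestSameTimestampRun are made-up names, and it assumes the sequence numbers and timestamps have already been parsed from the packets. A long run is only a hint, since a large video frame may legitimately span many packets with one timestamp.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal view of a parsed RTP header; field names are illustrative.
struct RtpHeaderInfo
{
	uint16_t seq;
	uint32_t timestamp;
};

// Returns the length of the longest run of consecutive packets that share
// one RTP timestamp. Returns 0 for an empty input.
std::size_t LongestSameTimestampRun(const std::vector<RtpHeaderInfo>& packets)
{
	std::size_t longest = 0;
	std::size_t current = 0;

	for (std::size_t i = 0; i < packets.size(); ++i)
	{
		// Extend the run if the timestamp repeats, otherwise start a new one.
		if (i > 0 && packets[i].timestamp == packets[i - 1].timestamp)
			++current;
		else
			current = 1;

		if (current > longest)
			longest = current;
	}

	return longest;
}
```

If the result keeps growing with the capture length instead of staying around the number of packets per frame, the sender is probably not advancing the RTP timestamp at all.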
Please @angedonik, do this change too. TS in strict mode does not allow octal literals.
BTW: I have it working, waiting for it to crash.
Done
With this demo the crash happens every single time: https://github.com/angedonik/plain-producer-demo