webrtc: datachannel hangs with packet loss
Your environment.
- Version: 2.2.17
- Browser: Linux Chrome 83 and iOS Safari 12
- Other Information -
What did you do?
This is a pion program that receives microphone RTP audio from a web browser and sends back packets on a datachannel. (I tried to reproduce this with a simple example but it does not occur, something is different code attached below).
I’m trying to simulate a datachannel issue seen on poor wifi. On Linux, simulate data loss with:
tc qdisc add dev lo root handle 1:0 netem loss 2.5%
(alternatively, same happens with iPad on wifi)
At some point,usually in first 20 seconds, web client stops calling data channel packet receive handler. Pion log shows:
sctp DEBUG: 18:06:07.296561 association.go:2113: [0xc00011e340] T3-rtx timed out: nRtos=1 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:06:09.296765 association.go:2113: [0xc00011e340] T3-rtx timed out: nRtos=2 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:06:13.297110 association.go:2113: [0xc00011e340] T3-rtx timed out: nRtos=3 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:06:21.297200 association.go:2113: [0xc00011e340] T3-rtx timed out: nRtos=4 cwnd=1228 ssthresh=4912
(these messages repeat...)
There is no indication that the datachannel is hung - the ICE state is connected, the RTP microphone audio is still being transmitted by the browser and received by pion, and the datachannel state has not changed.
What did you expect?
Datachannel should be stable on poor wifi
What happened?
Datachannel hangs and never recovers
Full log PION_LOG_DEBUG=all
rx offer from client-1eddf0cb-6120-4976-af11-c2076cee14c2
pc INFO: 2020/06/22 18:06:03 signaling state changed to have-remote-offer
pc INFO: 2020/06/22 18:06:03 signaling state changed to stable
ice DEBUG: 18:06:03.615143 agent.go:595: Started agent: isControlling? false, remoteUfrag: "lv+g", remotePwd: "aQnIP1LuW/bbwIOg1Ng5WVKt"
ice INFO: 2020/06/22 18:06:03 Setting new connection state: Checking
pc INFO: 2020/06/22 18:06:03 ICE connection state changed: checking
ICE state changed: checking
ice INFO: 2020/06/22 18:06:03 Setting new connection state: Connected
pc INFO: 2020/06/22 18:06:03 ICE connection state changed: connected
ICE state changed: connected
pc INFO: 2020/06/22 18:06:03 peer connection state changed: connected
sctp DEBUG: 18:06:03.722797 association.go:778: [0xc00011e340] state change: 'Closed' => 'CookieWait'
sctp DEBUG: 18:06:03.722812 association.go:323: [0xc00011e340] sending INIT
sctp DEBUG: 18:06:03.722831 association.go:414: [0xc00011e340] readLoop entered
sctp DEBUG: 18:06:03.722851 association.go:438: [0xc00011e340] writeLoop entered
sctp DEBUG: 18:06:03.723045 association.go:813: [0xc00011e340] chunkInit received in state 'CookieWait'
sctp DEBUG: 18:06:03.723122 association.go:875: [0xc00011e340] chunkInitAck received in state 'CookieWait'
sctp DEBUG: 18:06:03.723136 association.go:897: [0xc00011e340] initial rwnd=131072
sctp DEBUG: 18:06:03.723146 association.go:349: [0xc00011e340] sending COOKIE-ECHO
sctp DEBUG: 18:06:03.723156 association.go:778: [0xc00011e340] state change: 'CookieWait' => 'CookieEchoed'
sctp DEBUG: 18:06:03.723350 association.go:965: [0xc00011e340] COOKIE-ECHO received in state 'CookieEchoed'
sctp DEBUG: 18:06:03.723360 association.go:778: [0xc00011e340] state change: 'CookieEchoed' => 'Established'
sctp DEBUG: 18:06:03.723373 association.go:1146: [0xc00011e340] accepted a new stream (streamIdentifier: 1)
sctp DEBUG: 18:06:03.723383 stream.go:118: [1:0xc00011e340] reassemblyQueue readable=true
sctp DEBUG: 18:06:03.723389 stream.go:120: [1:0xc00011e340] readNotifier.signal()
sctp DEBUG: 18:06:03.723394 stream.go:122: [1:0xc00011e340] readNotifier.signal() done
ortc WARNING: 2020/06/22 18:06:03 webrtc.DetachDataChannels() enabled but didn't Detach, call Detach from OnOpen
sctp DEBUG: 18:06:03.723564 association.go:995: [0xc00011e340] COOKIE-ACK received in state 'Established'
pc DEBUG: 18:06:03.748571 peerconnection.go:280: got new track: &{mu:{w:{state:0 sema:0} writerSem:0 readerSem:0 readerCount:0 readerWait:0} id:ecc9ba16-9b43-4dce-8816-cb7c60a935b8 payloadType:111 kind:1 label:- ssrc:3263351828 codec:0xc000179a40 packetizer:<nil> receiver:0xc000136c00 activeSenders:[] totalSenderCount:0}
onTrack type=111 codec=opus
sctp DEBUG: 18:06:03.923549 association.go:650: [0xc00011e340] sending SACK: SACK cumTsnAck=2488767936 arwnd=1048576 dupTsn=[]
sctp DEBUG: 18:06:07.296561 association.go:2113: [0xc00011e340] T3-rtx timed out: nRtos=1 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:06:09.296765 association.go:2113: [0xc00011e340] T3-rtx timed out: nRtos=2 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:06:13.297110 association.go:2113: [0xc00011e340] T3-rtx timed out: nRtos=3 cwnd=1228 ssthresh=4912
sctp DEBUG: 18:06:21.297200 association.go:2113: [0xc00011e340] T3-rtx timed out: nRtos=4 cwnd=1228 ssthresh=4912
Another log
ice INFO: 2020/06/22 16:50:57 Setting new connection state: Connected
pc INFO: 2020/06/22 16:50:57 ICE connection state changed: connected
ICE state changed: connected
pc INFO: 2020/06/22 16:50:57 peer connection state changed: connected
sctp DEBUG: 16:50:57.326479 association.go:778: [0xc0008ec1a0] state change: 'Closed' => 'CookieWait'
sctp DEBUG: 16:50:57.326563 association.go:323: [0xc0008ec1a0] sending INIT
sctp DEBUG: 16:50:57.326504 association.go:414: [0xc0008ec1a0] readLoop entered
sctp DEBUG: 16:50:57.326627 association.go:438: [0xc0008ec1a0] writeLoop entered
sctp DEBUG: 16:50:57.327773 association.go:875: [0xc0008ec1a0] chunkInitAck received in state 'CookieWait'
sctp DEBUG: 16:50:57.327814 association.go:897: [0xc0008ec1a0] initial rwnd=131072
sctp DEBUG: 16:50:57.327845 association.go:349: [0xc0008ec1a0] sending COOKIE-ECHO
sctp DEBUG: 16:50:57.327870 association.go:778: [0xc0008ec1a0] state change: 'CookieWait' => 'CookieEchoed'
sctp DEBUG: 16:50:57.328894 association.go:995: [0xc0008ec1a0] COOKIE-ACK received in state 'CookieEchoed'
sctp DEBUG: 16:50:57.328939 association.go:778: [0xc0008ec1a0] state change: 'CookieEchoed' => 'Established'
sctp DEBUG: 16:50:57.328979 association.go:1146: [0xc0008ec1a0] accepted a new stream (streamIdentifier: 1)
sctp DEBUG: 16:50:57.329014 stream.go:118: [1:0xc0008ec1a0] reassemblyQueue readable=true
sctp DEBUG: 16:50:57.329067 stream.go:120: [1:0xc0008ec1a0] readNotifier.signal()
sctp DEBUG: 16:50:57.329097 stream.go:122: [1:0xc0008ec1a0] readNotifier.signal() done
ortc WARNING: 2020/06/22 16:50:57 webrtc.DetachDataChannels() enabled but didn't Detach, call Detach from OnOpen
pc DEBUG: 16:50:57.342435 peerconnection.go:280: got new track: &{mu:{w:{state:0 sema:0} writerSem:0 readerSem:0 readerCount:0 readerWait:0} id:435a165f-5faf-4c8d-a802-3be6d4e38e5c payloadType:111 kind:1 label:- ssrc:2047485769 codec:0xc000178930 packetizer:<nil> receiver:0xc0001366c0 activeSenders:[] totalSenderCount:0}
onTrack type=111 codec=opus
sctp DEBUG: 16:50:57.529259 association.go:650: [0xc0008ec1a0] sending SACK: SACK cumTsnAck=2488767936 arwnd=1048576 dupTsn=[]
sctp DEBUG: 16:51:01.820344 association.go:2113: [0xc0008ec1a0] T3-rtx timed out: nRtos=1 cwnd=1228 ssthresh=4912
sctp DEBUG: 16:51:03.820602 association.go:2113: [0xc0008ec1a0] T3-rtx timed out: nRtos=2 cwnd=1228 ssthresh=4912
sctp DEBUG: 16:51:07.820821 association.go:2113: [0xc0008ec1a0] T3-rtx timed out: nRtos=3 cwnd=1228 ssthresh=4912
sctp DEBUG: 16:51:15.821210 association.go:2113: [0xc0008ec1a0] T3-rtx timed out: nRtos=4 cwnd=1228 ssthresh=4912
sctp DEBUG: 16:51:31.821456 association.go:2113: [0xc0008ec1a0] T3-rtx timed out: nRtos=5 cwnd=1228 ssthresh=4912
sctp DEBUG: 16:52:03.821745 association.go:2113: [0xc0008ec1a0] T3-rtx timed out: nRtos=6 cwnd=1228 ssthresh=4912
sctp DEBUG: 16:54:03.822332 association.go:2113: [0xc0008ec1a0] T3-rtx timed out: nRtos=8 cwnd=1228 ssthresh=4912
sctp DEBUG: 16:54:03.822332 association.go:2113: [0xc0008ec1a0] T3-rtx timed out: nRtos=8 cwnd=1228 ssthresh=4912
sctp DEBUG: 16:55:03.822567 association.go:2113: [0xc0008ec1a0] T3-rtx timed out: nRtos=9 cwnd=1228 ssthresh=4912
sctp DEBUG: 16:56:03.822903 association.go:2113: [0xc0008ec1a0] T3-rtx timed out: nRtos=10 cwnd=1228 ssthresh=4912
sctp DEBUG: 16:57:03.823265 association.go:2113: [0xc0008ec1a0] T3-rtx timed out: nRtos=11 cwnd=1228 ssthresh=4912
sctp DEBUG: 16:58:03.823586 association.go:2113: [0xc0008ec1a0] T3-rtx timed out: nRtos=12 cwnd=1228 ssthresh=4912
sctp DEBUG: 16:59:03.823803 association.go:2113: [0xc0008ec1a0] T3-rtx timed out: nRtos=13 cwnd=1228 ssthresh=4912
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 19 (15 by maintainers)
Commits related to this issue
- extract useForwardTSN on InitAck also Relates to pion/webrtc#1270 — committed to pion/sctp by enobufs 4 years ago
- extract useForwardTSN in handleInit Relates to pion/webrtc#1270 — committed to pion/sctp by enobufs 4 years ago
- set reliability params on DCEP ACK Relates to pion/webrtc#1270 — committed to pion/datachannel by enobufs 4 years ago
- rm annoying log and fixed lint error Relates to pion/webrtc#1270 — committed to pion/sctp by enobufs 4 years ago
- Fix test errors Relates to pion/webrtc#1270 — committed to pion/datachannel by enobufs 4 years ago
- Extract useForwardTSN in handleInit Relates to pion/webrtc#1270 — committed to pion/sctp by enobufs 4 years ago
- Set reliability params on DCEP ACK Relates to pion/webrtc#1270 — committed to pion/datachannel by enobufs 4 years ago
- Set reliability params on DCEP ACK Relates to pion/webrtc#1270 — committed to pion/datachannel by enobufs 4 years ago
- Extract useForwardTSN in handleInit Relates to pion/webrtc#1270 — committed to pion/sctp by enobufs 4 years ago
- Fix handling of retransmitted cookie-echo Relates to pion/webrtc#1270 — committed to pion/sctp by enobufs 4 years ago
- Fix test failing with timeout too short Relates to pion/webrtc#1270 — committed to pion/sctp by enobufs 4 years ago
- Remove a test that no longer works This is related to #62 in which we removed the use of immediate ack (I) bit. This test became unstable as we no longer use immediate ack. Relates to pion/webrtc#127... — committed to pion/sctp by enobufs 4 years ago
- Fix handling of retransmitted cookie-echo Relates to pion/webrtc#1270 — committed to pion/sctp by enobufs 4 years ago
- Fix test failing with timeout too short Relates to pion/webrtc#1270 — committed to pion/sctp by enobufs 4 years ago
- Remove a test that no longer works This is related to #62 in which we removed the use of immediate ack (I) bit. This test became unstable as we no longer use immediate ack. Relates to pion/webrtc#127... — committed to pion/sctp by enobufs 4 years ago
- Fix handling of retransmitted cookie-echo Relates to pion/webrtc#1270 — committed to pion/sctp by enobufs 4 years ago
- Fix test failing with timeout too short Relates to pion/webrtc#1270 — committed to pion/sctp by enobufs 4 years ago
- Remove a test that no longer works This is related to #62 in which we removed the use of immediate ack (I) bit. This test became unstable as we no longer use immediate ack. Relates to pion/webrtc#127... — committed to pion/sctp by enobufs 4 years ago
- Fix handling of retransmitted cookie-echo Relates to pion/webrtc#1270 — committed to pion/sctp by enobufs 4 years ago
- Fix test failing with timeout too short Relates to pion/webrtc#1270 — committed to pion/sctp by enobufs 4 years ago
@Sean-Der @at-wat With the current pion/sctp and pion/datachannel master heads, data channel transmission over lossy connection is very stable to me and I no longer see any hang even with 20% of (extremely bad) packet loss. So far I have done the following combinations:
1:1 connection
1:2 connections
As to reliability options I have done both of these settings:
With the fixes (pion/sctp#127, pion/datachannel#64, pion/sctp#130) which are all landed on master, data channel got very robust and stable. Please let me tag the latest pion/sctp and pion/datachannel.
@at-wat I updated sctp and datachannel versions to the latest on v2 branch. FYI.