node-datachannel: Crash in Node server after browser client closes RTCPeerConnection

After a successful WebRTC connection (offer created by Node server) and verifying that messages were sent and received over a RTCDataChannel, the browser client runs RTCPeerConnection.close(). This crashed the Node server with:

FATAL ERROR: Error::New napi_get_last_error_info
 1: 0xb7a940 node::Abort() [node]
 2: 0xa8e72f node::FatalError(char const*, char const*) [node]
 3: 0xa8e738 node::OOMErrorHandler(char const*, bool) [node]
 4: 0xb40ac3 napi_fatal_error [node]
 5: 0x7fbd2ce055f0 Napi::CallbackScope::~CallbackScope() [<REPO_PATH>/node_modules/node-datachannel/build/Release/node_datachannel.node]
 6: 0x7fbd2ce05918 Napi::Error::New(napi_env__*) [<REPO_PATH>/node_modules/node-datachannel/build/Release/node_datachannel.node]
 7: 0x7fbd2cdeb5d3  [<REPO_PATH>/node_modules/node-datachannel/build/Release/node_datachannel.node]
 8: 0x7fbd2ce41255 ThreadSafeCallback::callbackFunc(Napi::Env, Napi::Function, Napi::Reference<Napi::Value>*, ThreadSafeCallback::CallbackData*) [<REPO_PATH>/node_modules/node-datachannel/build/Release/node_datachannel.node]
 9: 0xb4083c  [node]
10: 0x165f601 uv_run [node]
11: 0xabda6d node::SpinEventLoop(node::Environment*) [node]
12: 0xbc1164 node::NodeMainInstance::Run() [node]
13: 0xb35bc8 node::LoadSnapshotDataAndRun(node::SnapshotData const**, node::InitializationResult const*) [node]
14: 0xb3976f node::Start(int, char**) [node]
15: 0x7fbd34d00d0a __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
16: 0xabbdee _start [node]
Aborted (core dumped)

Environment:

  • node-datachannel version: 0.4.3
  • Docker image node:18.16.1-bullseye-slim running on DigitalOcean droplet
  • uname -a inside Docker container: Linux do-sfo3-0-dev-swarm-drone-demand-node-pool-f9ed0 6.1.0-0.deb11.6-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.15-1~bpo11+1 (2023-03-16) x86_64 GNU/Linux

I can swap between node-datachannel and wrtc for debugging purposes. When using wrtc, I don’t see a crash like this on the Node server after running RTCPeerConnection.close() on the browser client

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 19 (12 by maintainers)

Most upvoted comments

@regnaio Could you please test https://github.com/murat-dogan/node-datachannel/tree/issue-176

@murat-dogan Thanks to your help, I was able to test the change and could not reproduce the crash with the change

How I tested:

Oh, I misunderstood, sorry. Thank you for working on a fix!

I didn’t mean that either. I meant if we were deleting from instance list then cb will not be called. The solution is not related to that.

Thank you @regnaio for detailed instructions.

This problem is related to https://github.com/murat-dogan/node-datachannel/issues/87 In order to receive close events we need to keep callbacks, but that has side effects.

I added log outputs for this issue and for future issues. I got this log;

dataChannel.onMessage(): msg: <Buffer 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00>
2023-08-24 14:38:17.608 DEBUG [22488] [rtc::openssl::check@119] OpenSSL connection cleanly closed
2023-08-24 14:38:17.608 INFO  [22488] [rtc::impl::DtlsTransport::runRecvLoop@635] DTLS closed
2023-08-24 14:38:17.608 INFO  [22488] [rtc::impl::PeerConnection::changeState@1144] Changed state to disconnected
2023-08-24 14:38:17.609 INFO  [22488] [rtc::impl::SctpTransport::incoming@454] SCTP disconnected
2023-08-24 14:38:17.609 DEBUG [22488] [DataChannelWrapper::onClosed@344] onClosed cb received from rtc
2023-08-24 14:38:17.609 DEBUG [22373] [DataChannelWrapper::onClosed@347] mOnClosedCallback call
2023-08-24 14:38:17.609 DEBUG [22391] [PeerConnectionWrapper::onStateChange@704] onStateChange cb received from rtc
2023-08-24 14:38:17.609 INFO  [22391] [rtc::impl::PeerConnection::changeState@1144] Changed state to closed
2023-08-24 14:38:17.609 DEBUG [22391] [PeerConnectionWrapper::onStateChange@704] onStateChange cb received from rtc
2023-08-24 14:38:17.608 INFO  [22389] [rtc::impl::SctpTransport::processNotification@848] SCTP disconnected
dataChannel.onClosed()
2023-08-24 14:38:17.609 DEBUG [22389] [rtc::impl::SctpTransport::trySendQueue@572] SCTP shutdown
2023-08-24 14:38:17.609 DEBUG [22373] [PeerConnectionWrapper::onStateChange@683] onStateChange() called
2023-08-24 14:38:17.609 DEBUG [22389] [rtc::impl::SctpTransport::~SctpTransport@315] Destroying SCTP transport
2023-08-24 14:38:17.609 DEBUG [22373] [PeerConnectionWrapper::onSignalingStateChange@722] onSignalingStateChange() called
2023-08-24 14:38:17.609 DEBUG [22373] [PeerConnectionWrapper::onGatheringStateChange@762] onGatheringStateChange() called
2023-08-24 14:38:17.609 DEBUG [22373] [PeerConnectionWrapper::onLocalDescription@603] onLocalDescription() called
2023-08-24 14:38:17.609 DEBUG [22389] [rtc::impl::DtlsTransport::stop@478] Stopping DTLS recv thread
2023-08-24 14:38:17.609 DEBUG [22389] [rtc::impl::DtlsTransport::~DtlsTransport@460] Destroying DTLS transport
2023-08-24 14:38:17.609 DEBUG [22373] [PeerConnectionWrapper::onLocalCandidate@643] onLocalCandidate() called
2023-08-24 14:38:17.610 DEBUG [22373] [DataChannelWrapper::onOpen@285] onOpen() called
2023-08-24 14:38:17.610 DEBUG [22373] [DataChannelWrapper::onMessage@434] onMessage() called
2023-08-24 14:38:17.610 DEBUG [22389] [rtc::impl::IceTransport::~IceTransport@151] Destroying ICE transport
2023-08-24 14:38:17.610 DEBUG [22373] [DataChannelWrapper::onBufferedAmountLow@396] onBufferedAmountLow() called
2023-08-24 14:38:17.610 DEBUG [22373] [DataChannelWrapper::onClosed@323] onClosed() called
2023-08-24 14:38:17.610 DEBUG [22373] [DataChannelWrapper::onError@358] onError() called
2023-08-24 14:38:17.610 DEBUG [22373] [PeerConnectionWrapper::close@275] close() called
2023-08-24 14:38:17.610 DEBUG [22373] [PeerConnectionWrapper::doClose@256] doClose() called
2023-08-24 14:38:17.610 DEBUG [22373] [PeerConnectionWrapper::doClose@259] Closing...
2023-08-24 14:38:17.610 DEBUG [22373] [PeerConnectionWrapper::onStateChange@707] mOnStateChangeCallback call(1)
FATAL ERROR: Error::New napi_get_last_error_info
 1: 0xb09980 node::Abort() [node]
 2: 0xa1c235 node::FatalError(char const*, char const*) [node]
 3: 0xa1c23e  [node]
 4: 0xad139b napi_fatal_error [node]
 5: 0x7f370aa6bda0 Napi::CallbackScope::~CallbackScope() [/home/murat/js/node-datachannel/build/Release/node_datachannel.node]
 6: 0x7f370aa6c0c8 Napi::Error::New(napi_env__*) [/home/murat/js/node-datachannel/build/Release/node_datachannel.node]
 7: 0x7f370aa5092b  [/home/murat/js/node-datachannel/build/Release/node_datachannel.node]
 8: 0x7f370aaae095 ThreadSafeCallback::callbackFunc(Napi::Env, Napi::Function, Napi::Reference<Napi::Value>*, ThreadSafeCallback::CallbackData*) [/home/murat/js/node-datachannel/build/Release/node_datachannel.node]
 9: 0xacfddc  [node]
10: 0x1563dd1 uv_run [node]
11: 0xa43dd5 node::SpinEventLoop(node::Environment*) [node]
12: 0xb4bab6 node::NodeMainInstance::Run(node::EnvSerializeInfo const*) [node]
13: 0xacd3f2 node::Start(int, char**) [node]
14: 0x7f371c0b1083 __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
15: 0xa4076c  [node]
Aborted (core dumped)

So the problem is you are calling peerConnection.close(); before mOnStateChangeCallback call mRtcPeerConnPtr is already destroyed but we didn’t have time to call callbacks yet. Here we don’t reset callbacks for events, so it is still in the instances list. Found a solution, and I am testing it.

Yes I could reproduce it with this example code. I need some time to investigate more.

Just a note, in all cases you should not need reset callbacks manually.

For reference; Crashed log;

dataChannel.onMessage(): msg: <Buffer 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00>
2023-08-23 00:29:48.796 INFO  [23180] [rtc::impl::SctpTransport::processNotification@848] SCTP disconnected
2023-08-23 00:29:48.796 DEBUG [23274] [rtc::openssl::check@119] OpenSSL connection cleanly closed
2023-08-23 00:29:48.796 INFO  [23274] [rtc::impl::DtlsTransport::runRecvLoop@635] DTLS closed
2023-08-23 00:29:48.796 INFO  [23274] [rtc::impl::SctpTransport::incoming@454] SCTP disconnected
2023-08-23 00:29:48.796 INFO  [23181] [rtc::impl::PeerConnection::changeState@1144] Changed state to closed
2023-08-23 00:29:48.796 INFO  [23180] [rtc::impl::PeerConnection::changeState@1144] Changed state to disconnected
2023-08-23 00:29:48.796 DEBUG [23180] [rtc::impl::SctpTransport::trySendQueue@572] SCTP shutdown
2023-08-23 00:29:48.796 DEBUG [23181] [rtc::impl::SctpTransport::~SctpTransport@315] Destroying SCTP transport
2023-08-23 00:29:48.796 DEBUG [23181] [rtc::impl::DtlsTransport::stop@478] Stopping DTLS recv thread
dataChannel.onClosed()
2023-08-23 00:29:48.796 DEBUG [23181] [rtc::impl::DtlsTransport::~DtlsTransport@460] Destroying DTLS transport
2023-08-23 00:29:48.797 DEBUG [23181] [rtc::impl::IceTransport::~IceTransport@151] Destroying ICE transport
FATAL ERROR: Error::New napi_get_last_error_info
 1: 0xb09980 node::Abort() [node]
 2: 0xa1c235 node::FatalError(char const*, char const*) [node]
 3: 0xa1c23e  [node]
 4: 0xad139b napi_fatal_error [node]
 5: 0x7ff69e2735f0 Napi::CallbackScope::~CallbackScope() [/home/murat/js/temp/test/node_modules/node-datachannel/build/Release/node_datachannel.node]
 6: 0x7ff69e273918 Napi::Error::New(napi_env__*) [/home/murat/js/temp/test/node_modules/node-datachannel/build/Release/node_datachannel.node]
 7: 0x7ff69e2595d3  [/home/murat/js/temp/test/node_modules/node-datachannel/build/Release/node_datachannel.node]
 8: 0x7ff69e2af255 ThreadSafeCallback::callbackFunc(Napi::Env, Napi::Function, Napi::Reference<Napi::Value>*, ThreadSafeCallback::CallbackData*) [/home/murat/js/temp/test/node_modules/node-datachannel/build/Release/node_datachannel.node]
 9: 0xacfddc  [node]
10: 0x1563dd1 uv_run [node]
11: 0xa43dd5 node::SpinEventLoop(node::Environment*) [node]
12: 0xb4bab6 node::NodeMainInstance::Run(node::EnvSerializeInfo const*) [node]
13: 0xacd3f2 node::Start(int, char**) [node]
14: 0x7ff6afb2d083 __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
15: 0xa4076c  [node]
Aborted (core dumped)

NOT crashed;

dataChannel.onMessage(): msg: <Buffer 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00>
2023-08-23 00:36:42.745 DEBUG [23872] [rtc::openssl::check@119] OpenSSL connection cleanly closed
2023-08-23 00:36:42.745 INFO  [23776] [rtc::impl::SctpTransport::processNotification@848] SCTP disconnected
2023-08-23 00:36:42.745 INFO  [23872] [rtc::impl::DtlsTransport::runRecvLoop@635] DTLS closed
2023-08-23 00:36:42.745 INFO  [23776] [rtc::impl::PeerConnection::changeState@1144] Changed state to disconnected
2023-08-23 00:36:42.745 INFO  [23872] [rtc::impl::SctpTransport::incoming@454] SCTP disconnected
dataChannel.onClosed()
2023-08-23 00:36:42.746 DEBUG [23777] [rtc::impl::SctpTransport::trySendQueue@572] SCTP shutdown
2023-08-23 00:36:42.746 INFO  [23776] [rtc::impl::PeerConnection::changeState@1144] Changed state to closed
2023-08-23 00:36:42.746 DEBUG [23777] [rtc::impl::SctpTransport::~SctpTransport@315] Destroying SCTP transport
2023-08-23 00:36:42.746 DEBUG [23777] [rtc::impl::DtlsTransport::stop@478] Stopping DTLS recv thread
2023-08-23 00:36:42.746 DEBUG [23777] [rtc::impl::DtlsTransport::~DtlsTransport@460] Destroying DTLS transport
2023-08-23 00:36:42.746 DEBUG [23777] [rtc::impl::IceTransport::~IceTransport@151] Destroying ICE transport

The issue here seems to be that access to the callbacks is not synchronized in the wrapper, so this code ends up assigning PeerConnectionWrapper::mOnSignalingStateChangeCallback at the same time as it is called in the rtc::PeerConnection::onSignalingStateChange callback:

dataChannel.onClosed(() => {
	[...]
	peerConnection.onStateChange(state => {});
	[...]
});