neko: Can't log back in if all users disconnect and audio device doesn't exist

Hi, Since v2.8, I can’t seem to be able to log back in after one (or multiple) users connect, then all log out. The login prompt times out. A restart of the server is needed for it to work again.

For example: user1 logs in -> user1 logs out => timeout when trying to log back in user1 logs in -> user2 logs in -> user2 logs out => can still connect as long as user1 is connected user1 logs in -> user2 logs in -> user2 logs out -> user3 logs in -> user1 logs out => can still connect as long as user3 is connected user1 logs in -> user2 logs in -> user 2 logs out -> user1 logs out => timeout when trying to log in

Everything was working fine with v2.7 using the exact same config.

Logs from 2.8, connecting/disconnecting with one user:

 nurdism/m1k1o server v2.8.0
9:18AM INF preflight complete config=/etc/neko/neko.yaml debug=false logging=false
9:18AM INF starting neko server service=neko
9:18AM INF setting initial screen size module=desktop screen_size=1280x720@30
9:18AM INF using TCP MUX listener=0.0.0.0:60000 module=webrtc
9:18AM INF using UDP MUX module=webrtc port=60000
9:18AM INF webrtc starting ephemeral_port_range=59000-59100 ice_lite=true ice_servers="[{URLs:[stun:stun.l.google.com:19302] Username: Credential:<nil> CredentialType:password}]" module=webrtc nat_ips=[PublicIP]
9:18AM INF Listening TCP on 0.0.0.0:60000 module=webrtc submodule=pion subsystem=ice-tcp
9:18AM WRN http listening on neko.domain:8080 module=http
9:18AM INF neko ready service=neko
9:18AM INF creating pipeline codec=opus module=capture src="pulsesrc device=audio_output.monitor ! audio/x-raw,channels=2 ! audioconvert ! opusenc inband-fec=true bitrate=320000 ! appsink name=appsinkaudio" submodule=stream-sink video_id=audio
9:18AM INF first listener, starting module=capture submodule=stream-sink video_id=audio
9:18AM INF creating pipeline codec=vp8 module=capture src="ximagesrc display-name=:0.0 show-pointer=true use-damage=false ! video/x-raw,framerate=30/1 ! videoconvert ! queue !  vp8enc target-bitrate=1996800 cpu-used=4 end-usage=cbr threads=4 deadline=1 undershoot=95 buffer-size=12288 buffer-initial-size=6144 buffer-optimal-size=9216 keyframe-max-dist=25 min-quantizer=4 max-quantizer=20  ! appsink name=appsinkvideo" submodule=stream-sink video_id=video
9:18AM INF first listener, starting module=capture submodule=stream-sink video_id=video
9:18AM INF signaling state changed to have-local-offer module=webrtc submodule=pion subsystem=pc
9:18AM WRN negotiation is needed module=webrtc
9:18AM ERR creating offer failed error="InvalidModificationError: invalid proposed signaling state transition: have-local-offer->SetLocal(offer)->have-local-offer" module=webrtc
9:18AM INF signal update - LocalCandidate id=Y-mFZcBPkcZTXi55U8F2XBWX4k4XKPWB module=session
9:18AM INF signal update - LocalCandidate id=Y-mFZcBPkcZTXi55U8F2XBWX4k4XKPWB module=session
9:18AM INF sent all ICECandidates module=webrtc
9:18AM INF signal update - RemoteAnswer id=Y-mFZcBPkcZTXi55U8F2XBWX4k4XKPWB module=session
9:18AM INF signaling state changed to stable module=webrtc submodule=pion subsystem=pc
9:18AM INF Setting new connection state: Checking module=webrtc submodule=pion subsystem=ice
9:18AM INF ICE connection state changed: checking module=webrtc submodule=pion subsystem=pc
9:18AM INF connection state has changed connection_state=checking module=webrtc
9:18AM INF Setting new connection state: Connected module=webrtc submodule=pion subsystem=ice
9:18AM INF ICE connection state changed: connected module=webrtc submodule=pion subsystem=pc
9:18AM INF connection state has changed connection_state=connected module=webrtc
9:18AM INF peer connection state changed: connected module=webrtc submodule=pion subsystem=pc
9:18AM INF peer connected id=Y-mFZcBPkcZTXi55U8F2XBWX4k4XKPWB module=webrtc
9:18AM WRN read message error error="websocket: close 1005 (no status)" module=websocket
9:18AM WRN Failed to accept RTP stream is already closed module=webrtc submodule=pion subsystem=pc
9:18AM WRN Failed to read from candidate tcp4 host [PublicIP]:60000: io: read/write on closed pipe module=webrtc submodule=pion subsystem=ice
9:18AM WRN Failed to accept RTCP stream is already closed module=webrtc submodule=pion subsystem=pc
9:18AM INF Setting new connection state: Closed module=webrtc submodule=pion subsystem=ice
9:18AM INF peer connection state changed: closed module=webrtc submodule=pion subsystem=pc
9:18AM INF ICE connection state changed: closed module=webrtc submodule=pion subsystem=pc
9:18AM INF connection state has changed connection_state=closed module=webrtc
9:18AM INF peer closed id=Y-mFZcBPkcZTXi55U8F2XBWX4k4XKPWB module=webrtc

Same thing but on 2.7:

 nurdism/m1k1o server v2.7.0 dev
9:14AM INF preflight complete config=/etc/neko/neko.yaml debug=false logging=false
9:14AM INF starting neko server service=neko
9:14AM INF setting initial screen size module=desktop screen_size=1280x720@30
9:14AM INF using TCP MUX listener=0.0.0.0:60000 module=webrtc
9:14AM WRN UDPMuxDefault should not listening on unspecified address, use NewMultiUDPMuxFromPort instead module=webrtc submodule=pion subsystem=ice-udp
9:14AM INF using UDP MUX listener=0.0.0.0:60000 module=webrtc
9:14AM INF webrtc starting ephemeral_port_range=59000-59100 ice_lite=true ice_servers="[{URLs:[stun:stun.l.google.com:19302] Username: Credential:<nil> CredentialType:password}]" module=webrtc nat_ips=[PublicIP]
9:14AM INF Listening TCP on 0.0.0.0:60000 module=webrtc submodule=pion subsystem=ice-tcp
9:14AM WRN http listening on neko.domain:8080 module=http
9:14AM INF neko ready service=neko
9:15AM INF creating pipeline codec=opus module=capture src="pulsesrc device=auto_null.monitor ! audio/x-raw,channels=2 ! audioconvert ! opusenc inband-fec=true bitrate=320000 ! appsink name=appsink" submodule=stream-sink video_id=audio
9:15AM INF first listener, starting module=capture submodule=stream-sink video_id=audio
9:15AM INF creating pipeline codec=vp8 module=capture src="ximagesrc display-name=:0.0 show-pointer=true use-damage=false ! video/x-raw,framerate=30/1 ! videoconvert ! queue !  vp8enc target-bitrate=1996800 cpu-used=4 end-usage=cbr threads=4 deadline=1 undershoot=95 buffer-size=12288 buffer-initial-size=6144 buffer-optimal-size=9216 keyframe-max-dist=25 min-quantizer=4 max-quantizer=20  ! appsink name=appsink" submodule=stream-sink video_id=video
9:15AM INF first listener, starting module=capture submodule=stream-sink video_id=video
9:15AM INF signaling state changed to have-local-offer module=webrtc submodule=pion subsystem=pc
9:15AM INF sent all ICECandidates module=webrtc
9:15AM INF signal update - RemoteAnswer id=dyUB-wXo35CaQmE1BknxYpFCWAYvvYho module=session
9:15AM INF signaling state changed to stable module=webrtc submodule=pion subsystem=pc
9:15AM INF Setting new connection state: Checking module=webrtc submodule=pion subsystem=ice
9:15AM INF ICE connection state changed: checking module=webrtc submodule=pion subsystem=pc
9:15AM INF connection state has changed connection_state=checking module=webrtc
9:15AM INF Setting new connection state: Connected module=webrtc submodule=pion subsystem=ice
9:15AM INF ICE connection state changed: connected module=webrtc submodule=pion subsystem=pc
9:15AM INF connection state has changed connection_state=connected module=webrtc
9:15AM INF peer connection state changed: connected module=webrtc submodule=pion subsystem=pc
9:15AM INF peer connected id=dyUB-wXo35CaQmE1BknxYpFCWAYvvYho module=webrtc
9:15AM WRN read message error error="websocket: close 1005 (no status)" module=websocket
9:15AM WRN Failed to accept RTP stream is already closed module=webrtc submodule=pion subsystem=pc
9:15AM INF Setting new connection state: Closed module=webrtc submodule=pion subsystem=ice
9:15AM INF peer connection state changed: closed module=webrtc submodule=pion subsystem=pc
9:15AM WRN Failed to accept RTCP stream is already closed module=webrtc submodule=pion subsystem=pc
9:15AM INF ICE connection state changed: closed module=webrtc submodule=pion subsystem=pc
9:15AM INF connection state has changed connection_state=closed module=webrtc
9:15AM INF peer closed id=dyUB-wXo35CaQmE1BknxYpFCWAYvvYho module=webrtc
9:15AM INF destroying pipeline module=capture submodule=stream-sink video_id=audio
9:15AM INF last listener, stopping module=capture submodule=stream-sink video_id=audio
9:15AM INF destroying pipeline module=capture submodule=stream-sink video_id=video
9:15AM INF last listener, stopping module=capture submodule=stream-sink video_id=video
9:15AM INF destroying session module=session session_id=dyUB-wXo35CaQmE1BknxYpFCWAYvvYho

Seems like the only difference is Failed to read from candidate tcp4 host [PublicIP]:60000: io: read/write on closed pipe module=webrtc submodule=pion subsystem=ice, maybe that’s related?

My config file:

screen: "1280x720@30"
max_fps: 30
video_codec: "vp8"
video_bitrate: 2000
hwenc: "none"
audio_codec: "opus"
audio_bitrate: 256

password: "password"
password_admin: "admin"

bind: "neko.domain:8080"
udpmux: "60000"
tcpmux: "60000"
nat1to1: "[PublicIP]"
icelite: true

display: ":0.0"
static: "/usr/share/neko/dist"

The same thing happens no matter if I use the udpmux or epr setting. I am running Neko directly on an Alpine v3.17 VM, without docker, with nginx as reverse proxy.

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 16 (6 by maintainers)

Most upvoted comments

@innerop Thank you for the suggestion.

I am more concerned about why gstreamer did not complain that the audio device does not exist. When trying to set non-existing display -e "NEKO_DISPLAY=foo" it does not work, again does not produce any error, but does not freeze.

Maybe adding context.Context or at least some timeout to the destroy function would be a good solution.

https://github.com/m1k1o/neko/blob/b2080649ea50d942c3e9887813d867edbfc21052/server/internal/capture/gst/gst.go#L96-L105

But that could cause memory leaks in C code when not destroyed properly.

I tried locally starting gstreamer pipeline with nonexisting device and it fails immediatly. Maybe we did not catch that error and we still think that pipeline is running although is not and we attempt to destroy it what causes deadlock.

$ gst-launch-1.0 pulsesrc device=foo ! audio/x-raw,channels=2 ! audioconvert ! opusenc inband-fec=true bitrate=128000 ! autoaudiosink
Setting pipeline to PAUSED ...
Pipeline is live and does not need PREROLL ...
Setting pipeline to PLAYING ...
New clock: GstPulseSrcClock
ERROR: from element /GstPipeline:pipeline0/GstPulseSrc:pulsesrc0: Internal data stream error.
Additional debug info:
gstbasesrc.c(3072): gst_base_src_loop (): /GstPipeline:pipeline0/GstPulseSrc:pulsesrc0:
streaming stopped, reason not-negotiated (-4)
Execution ended after 0:00:00.000106250
Setting pipeline to NULL ...
Freeing pipeline ...

Great! So its caused by audio, that means also audio was not working for you. You could also revert to the old one by adding device: auto_null.monitor to your config.

@innerop it can be reproduced by setting device to something that does not exist e.g. -e "NEKO_DEVICE=foo".

It would be still interesting what causes the deadlock, but its probably gstreamer pipeline that cannot stop. When changing screen it stops only video pipeline, so it makes sense.

I’ve identified the problematic commit! Seems like 0dd9597 is what causes the issue. Adding the load-module line from that commit to my pulseaudio config seems to fix the issue.