beast: tcp_stream causing SIGABRT | Boost 1.73.0
I am hitting a crash in my WebSocket server: the process aborts with SIGABRT raised at
boost/beast/core/detail/stream_base.hpp:81
All the read/write/close operations are done on the same strand of the io_context for this WebSocket server. On Vinnie's suggestion I replaced tcp_stream with a plain TCP socket, and the crash went away. I also tried tcp_stream with only a single thread running the io_context, and the crash happened again, which suggests the issue is in tcp_stream.
I am attaching the stack trace file here, with some additional info added to it. The crash has a timestamp, and at step #31 I have pasted the timestamp at that point as well. The thread that crashed seems to have been stuck for more than 3 minutes trying to do a write operation.
The crash is a bit rare but is reproducible under load. Even without significant load, the program sometimes crashes with the same stack trace.
I cannot provide the whole transport layer here as the code is proprietary, but if needed we can figure out a way to work around that.
The program is running on
CentOS Linux release 7.9.2009 (Core) compiled with gcc (GCC) 8.3.1 20190311 (Red Hat 8.3.1-3).
About this issue
- State: open
- Created 3 years ago
- Comments: 30 (4 by maintainers)
@madmongo1 @mhassanshafiq This issue showed up again. Just to give some context:
This is my understanding of the stack trace: I see a close op being called, and it triggers an assert the same way the assert fires on the wr_impl.is_locked check, so this is clearly not a multithreading issue.
There are some very sensitive scenarios where I see this happening. The same code running the same scenario under less load doesn't trigger this assert.
I understand it's very difficult to go by just the symptoms we are describing here, but there is some code path in close.hpp that is clearly very delicate, and it would be great if it could be checked and fixed. It's like a ticking time bomb, and IMHO it isn't actually a Beast stream issue but some logic in the close code path.
Cheers
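On the point that this is not a multithreading issue: a strand (or a single thread) serializes handlers, but it does not stop a second write from being started before the first one has completed. As far as I know, the assert in stream_base.hpp is Beast's check that only one operation of each kind (read, write) is pending on a tcp_stream at a time; I have not confirmed that this is the one at line 81 in 1.73.0, so treat that as an assumption. Below is a minimal stdlib-only sketch of that invariant and of the usual outgoing-queue pattern that respects it. `OneWriteAtATimeStream` and `QueuedWriter` are hypothetical names for illustration, not Beast types, and the model throws where Beast would assert.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a stream that permits one pending write at a time.
// Beast would abort via an assertion; this model throws so the misuse is observable.
class OneWriteAtATimeStream {
public:
    // Start an "asynchronous" write. Completion is delivered by complete_write().
    void async_write(std::string msg, std::function<void()> on_done) {
        if (write_pending_)
            throw std::logic_error("second write started while one is pending");
        write_pending_ = true;
        in_flight_ = std::move(msg);
        on_done_ = std::move(on_done);
    }

    // Simulate the I/O layer finishing the pending write and invoking its handler.
    void complete_write() {
        if (!write_pending_)
            return;
        sent_.push_back(std::move(in_flight_));
        write_pending_ = false;
        auto cb = std::move(on_done_);
        on_done_ = nullptr;
        if (cb)
            cb();
    }

    bool write_pending() const { return write_pending_; }
    const std::vector<std::string>& sent() const { return sent_; }

private:
    bool write_pending_ = false;
    std::string in_flight_;
    std::function<void()> on_done_;
    std::vector<std::string> sent_;
};

// Outgoing queue: the next write starts only from the completion handler
// of the previous one, so two writes are never pending together.
class QueuedWriter {
public:
    explicit QueuedWriter(OneWriteAtATimeStream& s) : stream_(s) {}

    void send(std::string msg) {
        queue_.push(std::move(msg));
        if (queue_.size() == 1)  // nothing was in flight, so start now
            start_next();
    }

private:
    void start_next() {
        // The front element stays queued while its write is in flight.
        stream_.async_write(queue_.front(), [this] {
            queue_.pop();
            if (!queue_.empty())
                start_next();
        });
    }

    OneWriteAtATimeStream& stream_;
    std::queue<std::string> queue_;
};
```

Calling `async_write` twice in a row on the model fails even on one thread, while routing every send through `QueuedWriter::send` does not. If the real code can start a write or a close while an earlier write is still outstanding (for example under load, when completions lag), that would fit the symptoms, though without the transport code this is only a guess.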
@gopalak This is the suggestion I followed to work around the crash:
Just switch to an SSL stream with a regular Asio socket, and implement the timeout yourself.