uWebSockets: Segfault when connections are closing during publish()

I am hitting a segfault in the library while publishing to WebSockets. During the call to publish(), the library calls into drain(), which invokes the cb() callback of TopicTree. The segfault occurs inside that callback because the Subscriber pointer it receives is set to UINTPTR_MAX.

I could not reproduce this problem with a single client. The corner case occurs when thousands of clients are connected and I publish a large number of updates. As the clients become overwhelmed, I see TCP retransmits and TCP windows dropping to 0. As the congestion builds, some clients begin closing their sockets. When this happens, the call at TopicTree.h line 374, cb(min, intersectionCache[intersection] = std::move(res));, is eventually made with min set to UINTPTR_MAX. This then causes the segfault in WebSocketContextData.h at line 61, auto *asyncSocket = (AsyncSocket<SSL> *) s->user;, when it dereferences s to read user.

I’ve made two changes to work around this, but I am not sure whether they are the best or proper solution.

In WebSocketContextData.h, I added an if check on the subscriber pointer:

    WebSocketContextData() : topicTree([this](Subscriber *s, std::string_view data) -> int {
        /* We rely on writing to regular asyncSockets */
        if (s != (Subscriber *)UINTPTR_MAX) {
            auto *asyncSocket = (AsyncSocket<SSL> *) s->user;
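
The idea behind that guard can be sketched in isolation. The block below is only an illustration, not the library's code: Subscriber here is a hypothetical stand-in with just a user field, and FakeSocket, isSentinel and deliver are made-up names. It assumes, as in the report above, that the callback may be handed the all-ones value UINTPTR_MAX instead of a real pointer.

```cpp
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical stand-in for the library's Subscriber; only `user` matters here.
struct Subscriber {
    void *user;
};

// Hypothetical stand-in for the socket object that `user` points to.
struct FakeSocket {
    std::string sent;
};

// True when the pointer is the all-ones sentinel rather than a real subscriber.
inline bool isSentinel(const Subscriber *s) {
    return reinterpret_cast<std::uintptr_t>(s) == UINTPTR_MAX;
}

// Guarded callback: returns 0 when the data was delivered and -1 when the
// subscriber is null or the sentinel, in which case s->user is never read.
inline int deliver(Subscriber *s, std::string_view data) {
    if (s == nullptr || isSentinel(s)) {
        return -1;
    }
    auto *socket = static_cast<FakeSocket *>(s->user);
    socket->sent.append(data);
    return 0;
}
```

Calling deliver() with a real Subscriber appends the data to its socket, while calling it with (Subscriber *)UINTPTR_MAX returns -1 without touching memory. A guard like this hides the symptom at the callback. It does not explain why the sentinel reached the callback in the first place.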

And in TopicTree.h, in the drain() method around line 350, I added code to decrement nonEmpty:

                    else {
                        /* We need to lower nextMin to us, in the case of min being the last in a set */
                        if ((it[i] != end[i]) && (nextMin > *it[i])) {
                            nextMin = *it[i];
                        } else if (min == (Subscriber *)UINTPTR_MAX) {
                            if (nonEmpty > 0) {
                                nonEmpty--;
                            }
                        }
                    }
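
To show how a sentinel like this can escape a minimum scan, here is a much simplified sketch. It is not the TopicTree::drain() implementation. It assumes a loop that repeatedly picks the smallest subscriber across several sorted sets, starting each round from UINTPTR_MAX, and it uses plain integers (Id) in place of Subscriber pointers. The names Id, kNoSubscriber and forEachSubscriber are hypothetical. If every set is exhausted, for example because subscribers were removed as their connections closed, the minimum stays at the sentinel and the callback must not be invoked.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

using Id = std::uintptr_t;                 // stand-in for a Subscriber pointer value
constexpr Id kNoSubscriber = UINTPTR_MAX;  // sentinel: "no minimum found"

// Visits every distinct id across the sorted sets in ascending order and
// returns how many times the callback was invoked.
template <typename Callback>
std::size_t forEachSubscriber(const std::vector<std::set<Id>> &sets, Callback cb) {
    std::vector<std::set<Id>::const_iterator> it;
    for (const auto &s : sets) {
        it.push_back(s.begin());
    }

    std::size_t calls = 0;
    while (true) {
        // Find the smallest id that any iterator still points at.
        Id min = kNoSubscriber;
        for (std::size_t i = 0; i < sets.size(); i++) {
            if (it[i] != sets[i].end() && *it[i] < min) {
                min = *it[i];
            }
        }

        // Every set is exhausted: stop here instead of passing the sentinel on.
        if (min == kNoSubscriber) {
            break;
        }

        cb(min);
        calls++;

        // Advance every iterator that was sitting on the id just visited.
        for (std::size_t i = 0; i < sets.size(); i++) {
            if (it[i] != sets[i].end() && *it[i] == min) {
                ++it[i];
            }
        }
    }
    return calls;
}
```

In this sketch the termination test is on the minimum itself rather than on a separate counter of non-empty sets, so a counter that falls out of step with the iterators cannot cause an extra callback with the sentinel. Whether the same restructuring suits the real drain() is an open question for the maintainers.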

Here is the backtrace from gdb:

(gdb) bt
#0  0x00007f847a010837 in ioctl () from /lib64/libc.so.6
#1  0x00007f847cb854bf in oo_resource_op (cmd=1074027035, io=0x7f84745ce65c, fp=<optimized out>) at ../../../../../src/include/onload/unix_intf.h:80
#2  citp_signal_terminate (signum=<optimized out>, info=<optimized out>, context=<optimized out>) at ../../../../../src/lib/transport/ip/signal.c:280
#3  0x00007f847cb85412 in citp_signal_run_app_handler (sig=sig@entry=11, info=info@entry=0x7f84745ce8f0, context=context@entry=0x7f84745ce7c0) at ../../../../../src/lib/transport/ip/signal.c:106
#4  0x00007f847cb8565a in citp_signal_run_now (our_info=<optimized out>, context=0x7f84745ce7c0, info=0x7f84745ce8f0, signum=11) at ../../../../../src/lib/transport/ip/signal.c:226
#5  citp_signal_intercept (signum=11, info=0x7f84745ce8f0, context=0x7f84745ce7c0) at ../../../../../src/lib/transport/ip/signal.c:263
#6  <signal handler called>
#7  0x0000000000446f66 in std::_Function_handler<int (uWS::Subscriber*, std::basic_string_view<char, std::char_traits<char> >), uWS::WebSocketContextData<false>::WebSocketContextData()::{lambda(uWS::Subscriber*, std::basic_string_view<char, std::char_traits<char> >)#1}>::_M_invoke(std::_Any_data const&, uWS::Subscriber*&&, std::basic_string_view<char, std::char_traits<char> >&&) (__functor=...,
    __args#0=<optimized out>, __args#1=...) at /opt/x3p/xrt_b/base/lib/gcc/x86_64-unknown-linux/7/../../../../include/c++/7/bits/std_function.h:299
#8  0x00000000004277e8 in std::function<int (uWS::Subscriber*, std::basic_string_view<char, std::char_traits<char> >)>::operator()(uWS::Subscriber*, std::basic_string_view<char, std::char_traits<char> >) const (__args#1="", __args#0=<optimized out>, this=0x1f4cd48) at /opt/x3p/xrt_b/base/lib/gcc/x86_64-unknown-linux/7/../../../../include/c++/7/bits/std_function.h:706
#9  uWS::TopicTree::drain (this=0x1f4cd48) at bazel-out/k8-opt/bin/external/uWebSockets/_virtual_includes/uWebSockets/uWebSockets/TopicTree.h:374
#10 0x0000000000411bd7 in fu2::abi_400::detail::type_erasure::tables::vtable<fu2::abi_400::detail::property<true, false, void (uWS::Loop*)> >::invoke<0ul, fu2::abi_400::detail::type_erasure::data_accessor*, unsigned long const&, uWS::Loop*>(fu2::abi_400::detail::type_erasure::data_accessor*&&, unsigned long const&, uWS::Loop*&&) const (this=0x1ec7df0)
    at bazel-out/k8-opt/bin/external/uWebSockets/_virtual_includes/uWebSockets/uWebSockets/f2/function2.hpp:995
#11 fu2::abi_400::detail::type_erasure::erasure<true, fu2::abi_400::detail::config<true, false, fu2::capacity_default>, fu2::abi_400::detail::property<true, false, void (uWS::Loop*)> >::invoke<0ul, fu2::abi_400::detail::type_erasure::erasure<true, fu2::abi_400::detail::config<true, false, fu2::capacity_default>, fu2::abi_400::detail::property<true, false, void (uWS::Loop*)> >&, uWS::Loop*>(fu2::abi_400::detail::type_erasure::erasure<true, fu2::abi_400::detail::config<true, false, fu2::capacity_default>, fu2::abi_400::detail::property<true, false, void (uWS::Loop*)> >&, uWS::Loop*&&) (erasure=...)
    at bazel-out/k8-opt/bin/external/uWebSockets/_virtual_includes/uWebSockets/uWebSockets/f2/function2.hpp:1215
#12 fu2::abi_400::detail::type_erasure::invocation_table::operator_impl<0ul, fu2::abi_400::detail::function<fu2::abi_400::detail::config<true, false, fu2::capacity_default>, fu2::abi_400::detail::property<true, false, void (uWS::Loop*)> >, void (uWS::Loop*)>::operator()(uWS::Loop*) (args#0=0x384e000, this=0x1ec7de0)
    at bazel-out/k8-opt/bin/external/uWebSockets/_virtual_includes/uWebSockets/uWebSockets/f2/function2.hpp:775
#13 uWS::Loop::postCb (loop=0x384e000) at bazel-out/k8-opt/bin/external/uWebSockets/_virtual_includes/uWebSockets/uWebSockets/Loop.h:57
#14 0x0000000000706a88 in us_loop_run (loop=0x384e000) at external/uSockets/src/eventing/epoll_kqueue.c:149
#15 0x00000000004557e4 in uWS::Loop::run (this=<optimized out>) at bazel-out/k8-opt/bin/external/uWebSockets/_virtual_includes/uWebSockets/uWebSockets/Loop.h:152
#16 uWS::run () at bazel-out/k8-opt/bin/external/uWebSockets/_virtual_includes/uWebSockets/uWebSockets/Loop.h:164
#17 uWS::TemplatedApp<false>::run (this=<optimized out>) at bazel-out/k8-opt/bin/external/uWebSockets/_virtual_includes/uWebSockets/uWebSockets/App.h:334

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 36 (1 by maintainers)

Most upvoted comments

Alright, @cojmar reported 5 days in production without any crashes so this bug is officially fixed 😃