libzmq: zeromq can hang indefinitely in zmq::ctx_t::terminate() despite LINGER=0

I have hit an edge case in a Python program where ZeroMQ can hang indefinitely. I am having an extremely hard time isolating the bug; the most reliable way for me to reproduce it is to run our entire test suite repeatedly until it hangs (every 10th run on average). I’ve already posted a corresponding bug report at https://github.com/zeromq/pyzmq/issues/1003, but I think I’ve found out a bit more since then, which hopefully justifies creating an issue here.
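My reproduction procedure boils down to something like the following (a hedged sketch; `run_until_hang` is an illustrative helper, not our actual harness — in practice I leave the hung process alive and attach gdb, whereas `subprocess.run` kills the child once the timeout expires):

```python
import subprocess

def run_until_hang(cmd, timeout, max_runs):
    """Run cmd repeatedly; return the (0-based) run index on which it
    exceeded timeout seconds (i.e. presumably hung), or None if every
    run finished in time."""
    for run in range(max_runs):
        try:
            subprocess.run(cmd, timeout=timeout)
        except subprocess.TimeoutExpired:
            return run  # the suite did not finish in time
    return None

# e.g. run_until_hang(["py.test"], timeout=600, max_runs=50)
```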

The basic problem is that I create all contexts and sockets with LINGER=0 and BLOCKY=False, and still it sometimes hangs forever in zmq::ctx_t::terminate(). I am using zeromq 4.2.1. Looking at the GDB output, I can see that it hangs at https://github.com/zeromq/libzmq/blob/v4.2.1/src/ctx.cpp#L194. Here is the GDB output:

(gdb) frame
#4  0x00007f1d61ffcccd in zmq::ctx_t::terminate (this=0x4984290) at src/ctx.cpp:194
194	        int rc = term_mailbox.recv (&cmd, -1);
(gdb) p *this
$13 = {tag = 2880293630, sockets = {items = std::vector of length 48, capacity 64 = {0x48c6860, 0x5ca93d0,
      0x49d4bc0, 0x5c93cc0, 0x5bf5ff0, 0x4941160, 0x5ce6780, 0x5daee30, 0x5d03450, 0x5d0e030, 0x5dbf220, 0x49ccb20,
      0x49026d0, 0x4f375c0, 0x5d0ae60, 0x5e60cd0, 0x5d1b370, 0x5b00010, 0x4a14470, 0x4b0fb40, 0x4f247b0, 0x5c4fd60,
      0x48dd020, 0x5c4ed80, 0x4988070, 0x5d11ca0, 0x5c7b840, 0x4976780, 0x5aed1b0, 0x5cb16e0, 0x4944f50, 0x5d34bb0,
      0x4d79b60, 0x4a2fc90, 0x4c3b020, 0x5e64e60, 0x4eedc40, 0x4f87570, 0x5e5cd50, 0x4f41b40, 0x5e1ea80, 0x5afacd0,
      0x4acf650, 0x48c4640, 0x4f40a10, 0x5b57c60, 0x5b38b80, 0x5e66cc0}},
  empty_slots = std::vector of length 975, capacity 1024 = {1025, 1024, 1023, 1022, 1021, 1020, 1019, 1018, 1017,
    1016, 1015, 1014, 1013, 1012, 1011, 1010, 1009, 1008, 1007, 1006, 1005, 1004, 1003, 1002, 1001, 1000, 999, 998,
    997, 996, 995, 994, 993, 992, 991, 990, 989, 988, 987, 986, 985, 984, 983, 982, 981, 980, 979, 978, 977, 976,
    975, 974, 973, 972, 971, 970, 969, 968, 967, 966, 965, 964, 963, 962, 961, 960, 959, 958, 957, 956, 955, 954,
    953, 952, 951, 950, 949, 948, 947, 946, 945, 944, 943, 942, 941, 940, 939, 938, 937, 936, 935, 934, 933, 932,
    931, 930, 929, 928, 927, 926, 925, 924, 923, 922, 921, 920, 919, 918, 917, 916, 915, 914, 913, 912, 911, 910,
    909, 908, 907, 906, 905, 904, 903, 902, 901, 900, 899, 898, 897, 896, 895, 894, 893, 892, 891, 890, 889, 888,
    887, 886, 885, 884, 883, 882, 881, 880, 879, 878, 877, 876, 875, 874, 873, 872, 871, 870, 869, 868, 867, 866,
    865, 864, 863, 862, 861, 860, 859, 858, 857, 856, 855, 854, 853, 852, 851, 850, 849, 848, 847, 846, 845, 844,
    843, 842, 841, 840, 839, 838, 837, 836, 835, 834, 833, 832, 831, 830, 829, 828, 827, 826...}, starting = false,
  terminating = true, slot_sync = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0,
        __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}, attr = {
      __size = "\001\000\000", __align = 1}}, reaper = 0x5db58c0,
  io_threads = std::vector of length 1, capacity 1 = {0x5dda470}, slot_count = 1026, slots = 0x5dd5a80,
  term_mailbox = {<zmq::i_mailbox> = {_vptr.i_mailbox = 0x7f1d622633f0 <vtable for zmq::mailbox_t+16>},
    cpipe = {<zmq::ypipe_base_t<zmq::command_t>> = {
        _vptr.ypipe_base_t = 0x7f1d62263390 <vtable for zmq::ypipe_t<zmq::command_t, 16>+16>}, queue = {
        begin_chunk = 0x495c3c0, begin_pos = 0, back_chunk = 0x495c3c0, back_pos = 0, end_chunk = 0x495c3c0,
        end_pos = 1, spare_chunk = {ptr = 0x0}}, w = 0x495c3c0, r = 0x495c3c0, f = 0x495c3c0, c = {ptr = 0x0}},
    signaler = {w = 17, r = 17, pid = 9671}, sync = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0,
          __nusers = 0, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
        __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}, attr = {
        __size = "\001\000\000", __align = 1}}, active = false}, endpoints = std::map with 0 elements,
  pending_connections = std::multimap with 0 elements, endpoints_sync = {mutex = {__data = {__lock = 0, __count = 0,
        __owner = 0, __nusers = 0, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}, attr = {
      __size = "\001\000\000", __align = 1}}, static max_socket_id = {value = 100}, max_sockets = 1023,
  max_msgsz = 2147483647, io_thread_count = 1, blocky = false, ipv6 = false, thread_priority = -1,
  thread_sched_policy = -1, opt_sync = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0,
        __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}, attr = {
      __size = "\001\000\000", __align = 1}}, pid = 9671, crypto_sync = {mutex = {__data = {__lock = 0, __count = 0,
        __owner = 0, __nusers = 0, __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}, attr = {
      __size = "\001\000\000", __align = 1}}}

And the first 30 frames of the stack trace:

#0  0x00007f1d68dcc84d in poll () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f1d62025eca in poll (__timeout=-1, __nfds=1, __fds=0x7ffda9fc7dd0)
    at /usr/include/x86_64-linux-gnu/bits/poll2.h:46
#2  zmq::signaler_t::wait (this=this@entry=0x4984398, timeout_=timeout_@entry=-1) at src/signaler.cpp:228
#3  0x00007f1d6200d0b5 in zmq::mailbox_t::recv (this=this@entry=0x4984330, cmd_=cmd_@entry=0x7ffda9fc7e80,
    timeout_=timeout_@entry=-1) at src/mailbox.cpp:81
#4  0x00007f1d61ffcccd in zmq::ctx_t::terminate (this=0x4984290) at src/ctx.cpp:194
#5  0x00007f1d614a4ec8 in __pyx_f_3zmq_7backend_6cython_7context_7Context__term (__pyx_v_self=0x7f1d0232cdd8)
    at zmq/backend/cython/context.c:2210
#6  __pyx_pf_3zmq_7backend_6cython_7context_7Context_6term (__pyx_v_self=0x7f1d0232cdd8)
    at zmq/backend/cython/context.c:2323
#7  __pyx_pw_3zmq_7backend_6cython_7context_7Context_7term (__pyx_v_self=0x7f1d0232cdd8, unused=<optimized out>)
    at zmq/backend/cython/context.c:2289
#8  0x00007f1d6226ae29 in __Pyx_CyFunction_CallAsMethod (func=0x7f1d626838e8, args=0x7f1d4edfcac8, kw=0x0)
    at zmq/backend/cython/error.c:2013
#9  0x0000000000453fd2 in _PyObject_FastCallDict ()
#10 0x0000000000542cc5 in ?? ()
#11 0x000000000054692c in _PyEval_EvalFrameDefault ()
#12 0x0000000000541f01 in ?? ()
#13 0x000000000054bc86 in _PyFunction_FastCallDict ()
#14 0x000000000045411f in _PyObject_FastCallDict ()
#15 0x000000000045421b in _PyObject_Call_Prepend ()
#16 0x0000000000453fd2 in _PyObject_FastCallDict ()
#17 0x00000000004ca6d1 in ?? ()
#18 0x000000000043e2b1 in ?? ()
#19 0x000000000043ed89 in ?? ()
#20 0x000000000043f581 in _PyObject_GC_NewVar ()
#21 0x00000000004bcfd5 in PyTuple_New ()
#22 0x0000000000545665 in _PyEval_EvalFrameDefault ()
#23 0x0000000000541f01 in ?? ()
#24 0x0000000000543097 in ?? ()
#25 0x000000000054692c in _PyEval_EvalFrameDefault ()
#26 0x0000000000542b71 in ?? ()
#27 0x000000000054bbaa in _PyFunction_FastCallDict ()
#28 0x000000000045411f in _PyObject_FastCallDict ()
#29 0x000000000045421b in _PyObject_Call_Prepend ()
#30 0x0000000000453df0 in PyObject_Call ()

Unfortunately, I have not yet been able to reproduce this bug in a debug build of Python, hence the uninformative stack trace on the Python side.

Looking at the terminate() function, I don’t understand the rationale for letting int rc = term_mailbox.recv (&cmd, -1) wait forever. My understanding is that I set LINGER=0 precisely to avoid this behaviour. What am I missing?
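For reference, the way I configure the context and sockets boils down to this (a minimal pyzmq sketch of my setup, not the actual test suite):

```python
import zmq

ctx = zmq.Context()
ctx.set(zmq.BLOCKY, 0)          # BLOCKY=False: ctx.term() should not block by default

sock = ctx.socket(zmq.PUSH)
sock.setsockopt(zmq.LINGER, 0)  # LINGER=0: discard unsent messages on close

# ... use the socket ...

sock.close()
ctx.term()                      # in rare cases this call never returns
```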

Update: I thought it might be useful to include thread information. There are 11 threads in total; the backtrace above is from thread 1. Threads 4–11 belong to TensorFlow and can probably be ignored. I don’t know what threads 2 and 3 are for. Given that term_mailbox.recv waits for the reaper thread to terminate all sockets (according to the comment above that line of code), I wonder which of these threads is the reaper.

(gdb) info threads
  Id   Target Id         Frame
  11   Thread 0x7f1d2dc6a700 (LWP 9675) "py.test" pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  10   Thread 0x7f1d2d469700 (LWP 9676) "py.test" pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  9    Thread 0x7f1d2cc68700 (LWP 9677) "py.test" pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  8    Thread 0x7f1d2c467700 (LWP 9678) "py.test" pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  7    Thread 0x7f1d2bc66700 (LWP 9679) "py.test" pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  6    Thread 0x7f1d2b465700 (LWP 9680) "py.test" pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  5    Thread 0x7f1d2ac64700 (LWP 9681) "py.test" pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  4    Thread 0x7f1d2a463700 (LWP 9682) "py.test" pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  3    Thread 0x7f1ce0fff700 (LWP 9990) "py.test" 0x00007f1d68dda283 in epoll_wait ()
    at ../sysdeps/unix/syscall-template.S:81
  2    Thread 0x7f1ce8dfb700 (LWP 9991) "py.test" 0x00007f1d68dda283 in epoll_wait ()
    at ../sysdeps/unix/syscall-template.S:81
* 1    Thread 0x7f1d6a023740 (LWP 9671) "py.test" 0x00007f1d68dcc84d in poll ()
    at ../sysdeps/unix/syscall-template.S:81

The backtraces of threads 4 to 11 are identical:

(gdb) thread 4
[Switching to thread 4 (Thread 0x7f1d2a463700 (LWP 9682))]
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
185	../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S: No such file or directory.
(gdb) bt
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f1d61b864dc in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
   from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00007f1d39a79fc7 in Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WaitForWork(Eigen::EventCount::Waiter*, tensorflow::thread::EigenEnvironment::Task*) ()
   from /sonalytic/.venv/lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow.so
#3  0x00007f1d39a7a847 in std::_Function_handler<void (), Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::NonBlockingThreadPoolTempl(int, tensorflow::thread::EigenEnvironment)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /sonalytic/.venv/lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow.so
#4  0x00007f1d39a79950 in std::_Function_handler<void (), tensorflow::thread::EigenEnvironment::CreateThread(std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
   from /sonalytic/.venv/lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow.so
#5  0x00007f1d61b89a60 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007f1d69bfc184 in start_thread (arg=0x7f1d2a463700) at pthread_create.c:312
#7  0x00007f1d68dd9bed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Threads 2 and 3’s backtraces are identical, too, but their *this variables in frame 1 differ significantly:

(gdb) thread 2
[Switching to thread 2 (Thread 0x7f1ce8dfb700 (LWP 9991))]
#0  0x00007f1d68dda283 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
81	../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt
#0  0x00007f1d68dda283 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f1d62008a4c in zmq::epoll_t::loop (this=0x5ad0220) at src/epoll.cpp:168
#2  0x00007f1d6203b345 in thread_routine (arg_=0x5ad02a0) at src/thread.cpp:100
#3  0x00007f1d69bfc184 in start_thread (arg=0x7f1ce8dfb700) at pthread_create.c:312
#4  0x00007f1d68dd9bed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) up
#1  0x00007f1d62008a4c in zmq::epoll_t::loop (this=0x5ad0220) at src/epoll.cpp:168
168	            timeout ? timeout : -1);
(gdb) p *this
$22 = {<zmq::poller_base_t> = {_vptr.poller_base_t = 0x7f1d62262bf0 <vtable for zmq::epoll_t+16>, clock = {
      last_tsc = 93651029260469, last_time = 29439022}, timers = std::multimap with 48 elements = {[29443487] = {
        sink = 0x7f1cf40a2390, id = 1}, [29443499] = {sink = 0x7f1cf40aee80, id = 1}, [29443561] = {
        sink = 0x7f1cf4011170, id = 1}, [29443566] = {sink = 0x7f1cf40a9090, id = 1}, [29443594] = {
        sink = 0x7f1cf4002030, id = 1}, [29443600] = {sink = 0x7f1cf40b0990, id = 1}, [29443607] = {
        sink = 0x7f1cf40b05e0, id = 1}, [29443607] = {sink = 0x7f1cf4012cc0, id = 1}, [29443620] = {
        sink = 0x7f1cf4017c70, id = 1}, [29443628] = {sink = 0x7f1cf40a0f00, id = 1}, [29443653] = {
        sink = 0x7f1cf4016eb0, id = 1}, [29443666] = {sink = 0x7f1cf40a46f0, id = 1}, [29443667] = {
        sink = 0x7f1cf40a4ed0, id = 1}, [29443668] = {sink = 0x7f1cf40b0120, id = 1}, [29443671] = {
        sink = 0x7f1cf40adfd0, id = 1}, [29443674] = {sink = 0x7f1cf40afc60, id = 1}, [29443697] = {
        sink = 0x7f1cf40b13b0, id = 1}, [29443712] = {sink = 0x7f1cf40a1a20, id = 1}, [29443719] = {
        sink = 0x7f1cf40af6c0, id = 1}, [29443739] = {sink = 0x7f1cf417c780, id = 1}, [29443749] = {
        sink = 0x7f1cf40a6700, id = 1}, [29443762] = {sink = 0x7f1cf4013070, id = 1}, [29443780] = {
        sink = 0x7f1cf40b1d30, id = 1}, [29443782] = {sink = 0x7f1cf40ae380, id = 1}, [29443787] = {
        sink = 0x7f1cf40a6af0, id = 1}, [29443797] = {sink = 0x7f1cf40a7ab0, id = 1}, [29443803] = {
        sink = 0x7f1cf40a1ed0, id = 1}, [29443819] = {sink = 0x7f1cf40a3b20, id = 1}, [29443823] = {
        sink = 0x7f1cf40b0f20, id = 1}, [29443845] = {sink = 0x7f1cf4012820, id = 1}, [29443854] = {
        sink = 0x7f1cf406b700, id = 1}, [29443864] = {sink = 0x7f1cf40a2f50, id = 1}, [29443865] = {
        sink = 0x7f1cf40a9440, id = 1}, [29443879] = {sink = 0x7f1cf40af310, id = 1}, [29443895] = {
        sink = 0x7f1cf40b1870, id = 1}, [29443912] = {sink = 0x7f1cf40a4300, id = 1}, [29443922] = {
        sink = 0x7f1cf4010780, id = 1}, [29443955] = {sink = 0x7f1cf40ae9f0, id = 1}, [29443959] = {
        sink = 0x7f1cf40b21f0, id = 1}, [29443965] = {sink = 0x7f1cf40a7ea0, id = 1}, [29443979] = {
        sink = 0x7f1cf40a3340, id = 1}, [29443997] = {sink = 0x7f1cf4010b30, id = 1}, [29443998] = {
        sink = 0x7f1cf40b2850, id = 1}, [29444011] = {sink = 0x7f1cf40b2eb0, id = 1}, [29444020] = {
        sink = 0x7f1cf40a56b0, id = 1}, [29444076] = {sink = 0x7f1cf4012130, id = 1}, [29444106] = {
        sink = 0x7f1cf40adaf0, id = 1}, [29444111] = {sink = 0x7f1cf40a0aa0, id = 1}}, load = {value = 1}},
  ctx = @0x4984290, epoll_fd = 30, retired = std::vector of length 0, capacity 64, stopping = false, worker = {
    tfn = 0x7f1d62008c60 <zmq::epoll_t::worker_routine(void*)>, arg = 0x5ad0220, descriptor = 139762142787328},
  retired_sync = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 1, __spins = 0,
        __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}, attr = {
      __size = "\001\000\000", __align = 1}}}
(gdb) thread 3
[Switching to thread 3 (Thread 0x7f1ce0fff700 (LWP 9990))]
#0  0x00007f1d68dda283 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
81	in ../sysdeps/unix/syscall-template.S
(gdb) bt
#0  0x00007f1d68dda283 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f1d62008a4c in zmq::epoll_t::loop (this=0x4b2c290) at src/epoll.cpp:168
#2  0x00007f1d6203b345 in thread_routine (arg_=0x4b2c310) at src/thread.cpp:100
#3  0x00007f1d69bfc184 in start_thread (arg=0x7f1ce0fff700) at pthread_create.c:312
#4  0x00007f1d68dd9bed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) up
#1  0x00007f1d62008a4c in zmq::epoll_t::loop (this=0x4b2c290) at src/epoll.cpp:168
168	            timeout ? timeout : -1);
(gdb) p *this
$21 = {<zmq::poller_base_t> = {_vptr.poller_base_t = 0x7f1d62262bf0 <vtable for zmq::epoll_t+16>, clock = {
      last_tsc = 93281918731244, last_time = 29311916}, timers = std::multimap with 0 elements, load = {value = 1}},
  ctx = @0x4984290, epoll_fd = 28, retired = std::vector of length 0, capacity 2, stopping = false, worker = {
    tfn = 0x7f1d62008c60 <zmq::epoll_t::worker_routine(void*)>, arg = 0x4b2c290, descriptor = 139762010683136},
  retired_sync = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 1, __spins = 0,
        __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}, attr = {
      __size = "\001\000\000", __align = 1}}}

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 35 (21 by maintainers)

Most upvoted comments

Yes, thanks for the advice. It’s one of the reasons I chose the asyncio module, which runs single-threaded in an event loop. There should be no sockets in other threads.

Right now, I’m stuck trying to recreate the race condition in an environment where I can backtrace the garbage collection. It would be great if ZeroMQ had a built-in way to avoid this deadlock in the first place: it’s easy to detect that a program has crashed, but hard to detect that it is hanging. Note also that if I send a SIGTERM to the program (the Python test suite in this case), it stops hanging and continues to run normally. This may be expected, as the SIGTERM is presumably caught and swallowed by ZeroMQ.
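Until there is a built-in safeguard, the workaround I’m considering is to run ctx.term() on a daemon thread and give up after a deadline. A hedged sketch using only the standard library (`call_with_timeout` is a hypothetical helper; abandoning the stuck thread leaks the context, which is the lesser evil compared to hanging forever):

```python
import threading

def call_with_timeout(fn, timeout):
    """Run fn on a daemon thread; return True if it finished within
    timeout seconds, False if it is presumably stuck."""
    done = threading.Event()

    def runner():
        fn()
        done.set()

    threading.Thread(target=runner, daemon=True).start()
    return done.wait(timeout)

# In the real program fn would be ctx.term; if this returns False we
# log the apparent deadlock and let the daemon thread leak rather than
# block process shutdown.
```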