seastar: Crash/assert in seastar::aio_general_context::flush(): Assertion `clock::now() < retry_until' failed.
Hi folks,
in Redpanda we experience the following assert/crash under heavy load:
seastar::aio_general_context::flush(): Assertion `clock::now() < retry_until' failed.
https://github.com/scylladb/seastar/blob/master/src/core/reactor_backend.cc#L349-L356
[Backtrace #0]
void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at /v/build/v_deps_build/seastar-prefix/src/seastar/include/seastar/util/backtrace.hh:60
(inlined by) seastar::backtrace_buffer::append_backtrace() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:786
(inlined by) seastar::print_with_backtrace(seastar::backtrace_buffer&, bool) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:816
(inlined by) seastar::print_with_backtrace(char const*, bool) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:828
seastar::sigabrt_action() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:4065
(inlined by) operator() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:4041
(inlined by) __invoke at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:4037
/opt/redpanda/lib/libc.so.6: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=6e7b96dfb83f0bdcb6a410469b82f86415e5ada3, for GNU/Linux 3.2.0, stripped
__sigaction at ??:?
pthread_key_delete at ??:?
raise at ??:?
abort at ??:?
?? ??:0
__assert_fail at ??:?
seastar::aio_general_context::flush() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor_backend.cc:338
(inlined by) seastar::reactor_backend_aio::kernel_submit_work() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor_backend.cc:522
seastar::reactor::poll_once() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:3407
(inlined by) operator() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:3300
(inlined by) seastar::reactor::do_run() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:3328
operator() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:4522
(inlined by) decltype ((std::declval<seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_2&>())()) std::__1::__invoke[abi:v160004]<seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_2&>(seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_2&) at /vectorized/llvm/bin/../include/c++/v1/__functional/invoke.h:394
(inlined by) void std::__1::__invoke_void_return_wrapper<void, true>::__call<seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_2&>(seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_2&) at /vectorized/llvm/bin/../include/c++/v1/__functional/invoke.h:487
(inlined by) std::__1::__function::__alloc_func<seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_2, std::__1::allocator<seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_2>, void ()>::operator()[abi:v160004]() at /vectorized/llvm/bin/../include/c++/v1/__functional/function.h:185
(inlined by) std::__1::__function::__func<seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_2, std::__1::allocator<seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_2>, void ()>::operator()() at /vectorized/llvm/bin/../include/c++/v1/__functional/function.h:356
std::__1::__function::__value_func<void ()>::operator()[abi:v160004]() const at /vectorized/llvm/bin/../include/c++/v1/__functional/function.h:510
(inlined by) std::__1::function<void ()>::operator()() const at /vectorized/llvm/bin/../include/c++/v1/__functional/function.h:1156
(inlined by) seastar::posix_thread::start_routine(void*) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/posix.cc:73
pthread_condattr_setpshared at ??:?
__xmknodat at ??:?
I don’t have an easy reproducer but I can reliably reproduce it with one of our internal load tests if further debugging is needed.
This bit of code seems to have a bit of history and was already changed a couple times:
- https://github.com/scylladb/seastar/commit/1fb42e2776239341935efd7700336ec5eb9bc253
- https://github.com/scylladb/seastar/commit/528762adbaf2c6dde857957b920606378959e8fb
I am not sure I fully understand the code. The comment mentions an “infinite loop” but I guess what it means is a longer stall in flush
? I assume the 1 second was chosen arbitrarily? IMO stalling for 1 (or 5 or 10) seconds is better than crashing?
cc @bhalevy
Thanks, Stephan
About this issue
- Original URL
- State: open
- Created a year ago
- Comments: 22 (20 by maintainers)
Commits related to this issue
- aio_general_context: flush: return short flush on error As seen in https://github.com/scylladb/seastar/issues/1784, having flush retry `io_submit` in a loop might be futile if the kernel internal que... — committed to bhalevy/seastar by bhalevy a year ago
- aio_general_context: flush: return short flush on error As seen in https://github.com/scylladb/seastar/issues/1784, having flush retry `io_submit` in a loop might be futile if the kernel internal que... — committed to bhalevy/seastar by bhalevy a year ago
- aio_general_context: flush: return short flush on error As seen in https://github.com/scylladb/seastar/issues/1784, having flush retry `io_submit` in a loop might be futile if the kernel internal que... — committed to redpanda-data/seastar by bhalevy a year ago
- aio_general_context: flush: return short flush on error As seen in https://github.com/scylladb/seastar/issues/1784, having flush retry `io_submit` in a loop might be futile if the kernel internal que... — committed to redpanda-data/seastar by bhalevy a year ago
- aio_general_context: flush: return short flush on error As seen in https://github.com/scylladb/seastar/issues/1784, having flush retry `io_submit` in a loop might be futile if the kernel internal que... — committed to redpanda-data/seastar by bhalevy a year ago
- aio_general_context: flush: return short flush on error As seen in https://github.com/scylladb/seastar/issues/1784, having flush retry `io_submit` in a loop might be futile if the kernel internal que... — committed to graphcareful/seastar by bhalevy a year ago
- aio_general_context: flush: return short flush on error As seen in https://github.com/scylladb/seastar/issues/1784, having flush retry `io_submit` in a loop might be futile if the kernel internal que... — committed to graphcareful/seastar by bhalevy a year ago
- aio_general_context: flush: return short flush on error As seen in https://github.com/scylladb/seastar/issues/1784, having flush retry `io_submit` in a loop might be futile if the kernel internal que... — committed to redpanda-data/seastar by bhalevy a year ago
- aio_general_context: flush: return short flush on error As seen in https://github.com/scylladb/seastar/issues/1784, having flush retry `io_submit` in a loop might be futile if the kernel internal que... — committed to redpanda-data/seastar by bhalevy a year ago
@avikivity - here’s a more concrete repro to back the claim that N iocbs is not enough to support N sockets (nor N-C sockets for small C).
https://github.com/travisdowns/seastar/tree/td-eagain-repro
Run the
server.py
in the second commit like./server.py
, then to reproduce theclock::now() < retry_until
assert run:This tries to create 1000 sockets with
--max-networking-io-control-blocks
also set to 1000 and each socket is written to, read from and has a wait_for_input_shutdown callback attached.It fails after 2/3rds have been created. Each socket consumes 3 iocbs. The reason it fails after 2/3 and not 1/3 have been created is that the kernel somewhat arbitrarily multiples the number of supported iocbs you request in
io_setup
by 2. However, we should not rely on it and in any case 2 is still not enough as this test case shows.It also fails with smaller numbers, however because the kernel rounds up iocb counts to fill a full page there is a lot of quantization and it may pass for some values (e.g., with 10 sockets it will pass).
If you run it with DO_SLEEP=0 instead, it will reproduce the
last < end
crash after 1/3 of the sockets have been created, as we exceed the fixed-size array used to track enqueued iocbs. In our fork @StephanDollberg has also applied a fix for this issue: https://github.com/redpanda-data/seastar/pull/87/commits/d9175fcf84d31f21c0b72d0bea405f73ac1009b4 and that’s something we’d like to upstream too.cc @bhalevy
Will give that a try today and will also confirm we are indeed stalling on EAGAIN.