sled: Cannot shutdown sled from Windows Service

  1. sled version 0.28.0
  2. rustc version 1.39.0
  3. operating system Windows 10 Enterprise 1809 Build 17763.864
  4. logs

When I try to join my main thread after receiving the stop signal, I have 2 sled threads that are not terminating.

The Db and Trees are stored using RwLock:

use std::sync::{Arc, RwLock};
use sled::{Db, Tree};

pub struct Table {
    pub arc_tree_lock: RwLock<Option<Arc<Tree>>>,
    name: String,
    key: String,
    key2_opt: Option<String>,
}

pub struct MyDb {
    pub my_db: Arc<RwLock<Option<Db>>>,
    db_name: String,
    // ...
}

When I receive the Windows service stop control, I take the lock for each Tree, call flush(), and then set the Option to None to drop my reference to the Tree.

When that is done, I set my_db to None to drop my reference to the Db.
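The shutdown sequence described above can be sketched roughly as follows. This is a minimal illustration, not the reporter's actual code: `Tree` and `Db` here are hypothetical stand-ins for `sled::Tree` and `sled::Db` so the sketch compiles without the crate, and `shutdown` is an assumed name.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for sled::Tree; it only records that flush() ran.
pub struct Tree {
    flushed: AtomicBool,
}

impl Tree {
    pub fn new() -> Self {
        Tree { flushed: AtomicBool::new(false) }
    }

    pub fn flush(&self) -> Result<(), String> {
        self.flushed.store(true, Ordering::SeqCst);
        Ok(())
    }

    pub fn was_flushed(&self) -> bool {
        self.flushed.load(Ordering::SeqCst)
    }
}

// Hypothetical stand-in for sled::Db.
pub struct Db;

pub struct Table {
    pub arc_tree_lock: RwLock<Option<Arc<Tree>>>,
}

pub struct MyDb {
    pub my_db: Arc<RwLock<Option<Db>>>,
}

// Flush every tree, clear its slot, then clear the Db slot.
pub fn shutdown(tables: &[Table], db: &MyDb) -> Result<(), String> {
    for table in tables {
        let mut guard = table.arc_tree_lock.write().map_err(|e| e.to_string())?;
        if let Some(tree) = guard.as_ref() {
            tree.flush()?;
        }
        // Drops this Arc; the Tree itself is freed only if no other clone exists.
        *guard = None;
    }
    let mut guard = db.my_db.write().map_err(|e| e.to_string())?;
    *guard = None;
    Ok(())
}
```

Setting the Option to None only drops one `Arc` handle. If any other clone of the `Arc<Tree>` is still held elsewhere in the program, the Tree stays alive after this sequence.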

However, 2 threads are still waiting, and they prevent the final join from completing. There were 3, including the log flusher, but after I added the code that flushes and then sets the Option to None, the log flusher at least seems to have cleaned up.

The 2 remaining threads are named sled-io and have the following stack trace (captured after Break All):

 	[External Code]	
 	[Inline Frame] my.exe!!std::sys::windows::c::SleepConditionVariableSRW() Line 70	Unknown
 	[Inline Frame] my.exe!!std::sys::windows::condvar::Condvar::wait() Line 22	Unknown
 	[Inline Frame] my.exe!!std::sys_common::condvar::Condvar::wait() Line 41	Unknown
 	[Inline Frame] my.exe!!std::sync::condvar::Condvar::wait() Line 204	Unknown
 	my.exe!std::thread::park() Line 911	Unknown
 	my.exe!!crossbeam_channel::context::Context::wait_until(core::option::Option<std::time::Instant> self) Line 175	Unknown
 	my.exe!!crossbeam_channel::flavors::zero::{{impl}}::recv::{{closure}}<alloc::boxed::Box<FnOnce<()>>>(crossbeam_channel::flavors::zero::{{impl}}::recv::closure-1 cx, crossbeam_channel::context::Context *) Line 304	Unknown
 	my.exe!!crossbeam_channel::context::{{impl}}::with::{{closure}}<closure-1,core::result::Result<alloc::boxed::Box<FnOnce<()>>, crossbeam_channel::err::RecvTimeoutError>>(crossbeam_channel::context::{{impl}}::with::closure-0 * cx, crossbeam_channel::context::Context *) Line 51	Unknown
 	my.exe!!crossbeam_channel::context::{{impl}}::with::{{closure}}<closure-1,core::result::Result<alloc::boxed::Box<FnOnce<()>>, crossbeam_channel::err::RecvTimeoutError>>(crossbeam_channel::context::{{impl}}::with::closure-1 cell, core::cell::Cell<core::option::Option<crossbeam_channel::context::Context>> *) Line 58	Unknown
 	my.exe!!std::thread::local::LocalKey<core::cell::Cell<core::option::Option<crossbeam_channel::context::Context>>>::try_with<core::cell::Cell<core::option::Option<crossbeam_channel::context::Context>>,closure-1,core::result::Result<alloc::boxed::Box<FnOnce<()>>, crossbeam_channel::err::RecvTimeoutError>>(crossbeam_channel::context::{{impl}}::with::closure-1 self) Line 262	Unknown
>	my.exe!!crossbeam_channel::context::Context::with<closure-1,core::result::Result<alloc::boxed::Box<FnOnce<()>>, crossbeam_channel::err::RecvTimeoutError>>(crossbeam_channel::flavors::zero::{{impl}}::recv::closure-1 f) Line 53	Unknown
 	my.exe!!crossbeam_channel::flavors::zero::Channel<alloc::boxed::Box<FnOnce<()>>>::recv<alloc::boxed::Box<FnOnce<()>>>(core::option::Option<std::time::Instant> self) Line 293	Unknown
 	my.exe!!crossbeam_channel::channel::Receiver<alloc::boxed::Box<FnOnce<()>>>::recv<alloc::boxed::Box<FnOnce<()>>>() Line 718	Unknown
 	my.exe!!crossbeam_channel::channel::{{impl}}::next<alloc::boxed::Box<FnOnce<()>>>(crossbeam_channel::channel::Iter<alloc::boxed::Box<FnOnce<()>>> * self) Line 1105	Unknown
 	my.exe!!pagecache::threadpool::spawn::init_pool::{{closure}}(pagecache::threadpool::spawn::init_pool::closure-0) Line 54	Unknown
 	my.exe!!std::sys_common::backtrace::__rust_begin_short_backtrace<closure-0,()>(pagecache::threadpool::spawn::init_pool::closure-0 f) Line 127	Unknown
 	my.exe!!std::thread::{{impl}}::spawn_unchecked::{{closure}}::{{closure}}<closure-0,()>(std::thread::{{impl}}::spawn_unchecked::{{closure}}::closure-0) Line 471	Unknown
 	my.exe!!std::panic::{{impl}}::call_once<(),closure-0>(std::panic::AssertUnwindSafe<closure-0> self) Line 316	Unknown
 	my.exe!!std::panicking::try::do_call<std::panic::AssertUnwindSafe<closure-0>,()>(unsigned char * data) Line 292	Unknown
 	my.exe!!panic_unwind::__rust_maybe_catch_panic() Line 80	Unknown
 	my.exe!!std::panicking::try<(),std::panic::AssertUnwindSafe<closure-0>>(std::panic::AssertUnwindSafe<closure-0> f) Line 271	Unknown
 	my.exe!!std::panic::catch_unwind<std::panic::AssertUnwindSafe<closure-0>,()>(std::panic::AssertUnwindSafe<closure-0> f) Line 394	Unknown
 	my.exe!!std::thread::{{impl}}::spawn_unchecked::{{closure}}<closure-0,()>(std::thread::{{impl}}::spawn_unchecked::closure-0) Line 469	Unknown
 	my.exe!!core::ops::function::FnOnce::call_once<closure-0,()>(std::thread::{{impl}}::spawn_unchecked::closure-0 *) Line 227	Unknown
 	my.exe!!alloc::boxed::{{impl}}::call_once<(),FnOnce<()>>() Line 922	Unknown
 	[Inline Frame] my.exe!!alloc::boxed::{{impl}}::call_once() Line 922	Unknown
 	[Inline Frame] my.exe!!std::sys_common::thread::start_thread() Line 13	Unknown
 	my.exe!std::sys::windows::thread::{{impl}}::new::thread_start() Line 47	Unknown
 	[External Code]	
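The trace shows a pool worker started from `pagecache::threadpool::spawn::init_pool`, parked inside `Receiver::recv` on a zero-capacity crossbeam channel of boxed closures, reached through the channel's iterator. The sketch below reproduces that shape with `std::sync::mpsc::sync_channel(0)` as a stand-in for crossbeam_channel; `Job` and `spawn_worker` are hypothetical names, and this is not sled's actual implementation.

```rust
use std::sync::mpsc::{sync_channel, SyncSender};
use std::thread::{self, JoinHandle};

// A unit of work handed to the pool, mirroring Box<FnOnce()> in the trace.
type Job = Box<dyn FnOnce() + Send + 'static>;

// Spawns one worker that runs jobs until the channel is disconnected.
// Returns the sending side and a handle yielding the number of jobs run.
fn spawn_worker() -> (SyncSender<Job>, JoinHandle<usize>) {
    // Capacity 0 makes this a rendezvous channel.
    let (tx, rx) = sync_channel::<Job>(0);
    let handle = thread::spawn(move || {
        let mut completed = 0;
        // Each iteration blocks in recv() until a job arrives.
        // The loop ends only when every sender has been dropped.
        for job in rx {
            job();
            completed += 1;
        }
        completed
    });
    (tx, handle)
}
```

With this shape, a worker whose sending side is never dropped stays parked in recv() indefinitely, which is consistent with the state the debugger shows.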

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 17 (2 by maintainers)

Most upvoted comments

Joining a thread isn’t transitive, so if you’re just joining this one thread which actix or actix-web is running on, sled’s threadpool shouldn’t have anything to do with it. There is likely an issue in how you are stopping the actix or actix-web runtime, unrelated to sled.
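A small illustration of the point that joining is not transitive, using only the standard library. The function name `join_outer_only` is hypothetical, and the inner thread stands in for any background thread started by the thread being joined.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

// Spawns an outer thread that starts an inner thread and then finishes.
// Joins only the outer thread and hands back the still-running inner one.
fn join_outer_only() -> (Sender<()>, JoinHandle<()>) {
    let (release_tx, release_rx) = channel::<()>();
    let outer = thread::spawn(move || {
        // The inner thread blocks until it is released or the sender is dropped.
        thread::spawn(move || {
            let _ = release_rx.recv();
        })
    });
    // This join returns as soon as the outer closure returns,
    // even though the inner thread is still blocked in recv().
    let inner = outer.join().unwrap();
    (release_tx, inner)
}
```

Here `outer.join()` completes while the inner thread is still waiting, so a join that never returns points at the joined thread itself rather than at threads it started.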

There have been multiple bugfixes since 0.28, and there are known bugs in the issue tracker presently.

You mentioned in your initial posts that you were joining threads, and join was not returning. Are you modifying sled to expose the JoinHandles of its thread pool/flusher threads, or are you doing something else with Windows APIs?

I did some searching, and I haven’t found any solid requirement that Windows services stop all their threads upon receiving a stop service control event. From what I’ve read, it sounds like it’s fine for the process to exit after the whole service lifecycle is done and StartServiceCtrlDispatcher has returned to your entry point function. Once all Db and Tree objects have been dropped, there should be no more work for the thread pool to do, so the sled-io threads will be waiting indefinitely. As such, it’s fine for them to be automatically terminated as the process exits.