tokio-graceful-shutdown: Program hangs after successful shutdown

Hi there,

During some recent testing I ran into an “interesting” issue with SIGINT and similar shutdowns. The exact situation I identified is the following:

Deep inside one of my subsystems, I await an async lookup_host(), which apparently has a timeout of 20 seconds. When I hit CTRL+C, all subsystems shut down and tokio_graceful_shutdown::toplevel logs "Shutdown successful".

At this point the program does not terminate though – it only exits after the lookup_host() future resolves/rejects. Do you happen to have an idea on how to force tokio to not wait for such things before shutting down itself?

About this issue

  • State: closed
  • Created a year ago
  • Comments: 33 (20 by maintainers)

Most upvoted comments

Not really, to be honest. It’s different enough that it shouldn’t be part of this crate, in my opinion.

It’s just a few lines of code, so I don’t think it’s even worth abstracting it much. Rather the opposite: because runtime configuration is so highly context dependent, I think this is already the best and most generic API. Writing a macro would only make my crate more opinionated, and I’m trying to keep it as generic and unopinionated as possible.

Yah, it’s still open for debate. Now I just need two things:

  • tests
  • agreement with the tokio devs

The second one could take a while; tokio is notoriously understaffed, with many open pull requests. They are doing an excellent job, just to clarify, but there are just too many people wanting things done 😄

So be patient, I guess.

I’ve created an issue on the tokio repository. Maybe you can chime in there as well if you have anything to add ❤️

@DASPRiD Until I/we come up with a solution, you can work around it like this:

use std::{thread, time::Duration};

use env_logger::{Builder, Env};
use miette::Result;
use tokio::select;
use tokio_graceful_shutdown::{SubsystemHandle, Toplevel};

async fn hanging_task(subsys: SubsystemHandle) -> Result<()> {
    // Simulate a blocking call that cannot be cancelled once it has started
    let hanging_task = tokio::task::spawn_blocking(|| {
        thread::sleep(Duration::from_secs(10));
    });

    select! {
        e = hanging_task => e.unwrap(),
        _ = subsys.on_shutdown_requested() => (),
    }

    Ok(())
}

#[tokio::main]
async fn main() -> Result<()> {
    // Init logging
    Builder::from_env(Env::default().default_filter_or("debug")).init();

    Toplevel::new()
        .catch_signals()
        .start("Hanging Task", hanging_task)
        .handle_shutdown_requests(Duration::from_millis(200))
        .await?;

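    // Watchdog: if the process exits normally, this thread dies with it.
    // If runtime teardown hangs instead, force an exit after one second.
    thread::spawn(|| {
        thread::sleep(Duration::from_millis(1000));
        log::error!("Shutdown seems to hang. Killing program ...");
        std::process::exit(1);
    });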

    Ok(())
}

I could add something along those lines in handle_shutdown_requests:

use std::{thread, time::Duration};

use env_logger::{Builder, Env};
use miette::Result;
use tokio::select;
use tokio_graceful_shutdown::{SubsystemHandle, Toplevel};

async fn hanging_task(subsys: SubsystemHandle) -> Result<()> {
    let hanging_task = tokio::task::spawn_blocking(|| {
        thread::sleep(Duration::from_secs(10));
    });

    select! {
        e = hanging_task => e.unwrap(),
        _ = subsys.on_shutdown_requested() => {
            // Spawn thread that kills leftover tasks if necessary
            thread::spawn(|| {
                thread::sleep(Duration::from_millis(1000));
                println!("Shutdown seems to hang. Killing program ...");
                std::process::exit(1);
            });
        },
    }

    Ok(())
}

#[tokio::main]
async fn main() -> Result<()> {
    // Init logging
    Builder::from_env(Env::default().default_filter_or("debug")).init();

    Toplevel::new()
        .catch_signals()
        .start("Hanging Task", hanging_task)
        .handle_shutdown_requests(Duration::from_millis(200))
        .await
        .map_err(Into::into)
}

What do you think?

The idea is that we spawn a watchdog thread. If main() returns normally, the process exits and the thread is killed along with it. But if main() hangs, the thread kills the program after a grace period.
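
For reference, here is the same watchdog idea pulled out into a small std-only helper. This is only a sketch: spawn_watchdog and kill_process are hypothetical names I made up for illustration, not part of tokio-graceful-shutdown or tokio. The timeout action is passed in as a closure so that the helper can be exercised without actually killing the process.

```rust
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical helper (not part of any crate): spawns a thread that waits
/// for `grace` and then runs `on_timeout`.
///
/// There is deliberately no cancel mechanism. If the process exits normally
/// before the grace period is over, the thread is terminated with it and
/// `on_timeout` never runs.
pub fn spawn_watchdog<F>(grace: Duration, on_timeout: F) -> JoinHandle<()>
where
    F: FnOnce() + Send + 'static,
{
    thread::spawn(move || {
        // thread::sleep blocks for at least `grace`
        thread::sleep(grace);
        // Still alive after the grace period, so shutdown is assumed to hang
        on_timeout();
    })
}

/// Hypothetical timeout action matching the workaround above:
/// report the hang and terminate the process with a non-zero exit code.
pub fn kill_process() {
    eprintln!("Shutdown seems to hang. Killing program ...");
    std::process::exit(1);
}
```

In main(), this would be called right after handle_shutdown_requests(...) has returned, for example spawn_watchdog(Duration::from_secs(1), kill_process). Dropping the returned JoinHandle simply detaches the thread, which is the behavior we want here.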