runtime: BackgroundService handles Exceptions getting thrown differently when tasks are not faulted vs faulted

I think how exceptions are handled with BackgroundService class is a bit confusing and may require some discussion.

If an exception is thrown during startup (e.g. in Main) and before ExecuteAsync is handled, the app will crash.
If an exception is thrown during ExecuteAsync but before the first await is called, the app will crash.
If an exception is thrown during ExecuteAsync but during or after the first await is called, the app will … semi-hang? It stop processing … but … Ctrl-C does successfully start the app gracefully terminating.

Now - I hope i have the above steps/scenario’s right – I might have fudged something (async/await/ task contexts, etc are my weak points).

So the problem here is about the inconsistency about what an exception should be doing in a BackgroundService. Should it crash the host (assuming no top level try/catch in Main)? or get swallowed? Maybe the issue is really about documentation (as the code is totally fine).

This issue arose when I started a conversation in the “ASP.NET slack” group when I was trying to handle an exception getting thrown in side the ExecuteAsync method and bubbling that upwards.

To quote some of the summaries of this convo:

@davidfowl [7:06 PM] great discussion heh it could suppress both and if you wanted to get exceptions, then you override StartAsync and put logic there

–

@chrisvanderpennen [7:13 PM] theres imo three scenarios that need covering: i don’t want any exceptions in this to stop my host i want an exception during init to stop the host startup, but runtime exceptions are ok i want a runtime exception to stop my host

@davidfowl [7:30 PM] somebody file a bug

so CC’ing @poke @gulbanana @chrisvanderpennen and @davidfowl who were in the convo.

I think the issue subject might need to be refactored.
happy to edit the main body of this issue, where i got my facts wrong.
This issue needs the famous tag ‘fowler’ (fowler has issues). 😃

edit: this was a sample gist I tried to make to also test/repo this problem.

About this issue

Original URL
State: closed
Created 6 years ago
Reactions: 18
Comments: 28 (12 by maintainers)

Commits related to this issue

Merge pull request #813 from benaadams/threading-issue Move AddProviderRegistration under lock — committed to dotnet-maestro-bot/Common by pakrym 6 years ago

Most upvoted comments

Let me try to paint the worst case scenario for each “mode”. Scenario 1. You have a non-important Service and the default is to crash on unhandled errors This brings your application down. You go look at the logs, and you see that it’s your NotImportantService that brought the house down. You curse under your breath, and wrap it in a dummy catch or a Polly retry or whatever. Worst case is you get some downtime once.

Scenario 2. You have an important service, and the default is to just let it die Your service runs some stuff every hour. At some point data stops coming through. You look at the server - it’s running fine (and you don’t know that a crashing BackgroundService doesn’t crash the server). You look at the logs. Finding anything is hard (impossible without this new PR), because the service could have crashed a long time ago. You mutter under your breath, and figure out how to properly close the application via IApplicationLifetime or similar. I’m not exaggerating when I’ve seen people spend weeks re-writing stuff because they were unable to debug a silently crashing BackgroundService, and thought the process was gettign OOMKilled or similar.

All I’m saying is that the failure mode for Scenario 2 is much worse, which is why I think that crashing should be the default. Admittedly that calculation was more heavily skewed back when we didn’t even have any logs.

But perhaps the right solution is for there to be no default - but forcing the user to explicitly specify “Should this backgroundservice crash the server on an uncaught error” - because I agree that there are both types of services.

+12

GeeWee on Oct 13, 2020

Correct. Unfortunately there was higher priority work that we had to bump this out for.

analogrelay on Jul 31, 2019

Yes, I have a branch, I’ve just been lazy finishing it. I’ll do it next week.

davidfowl on Mar 30, 2019

I couldn’t figure out the best area label to add to this issue. Please help me learn by adding exactly one area label.

Dotnet-GitSync-Bot on May 13, 2020

I just want to do a shameless plug that I’ve implemented a tiny library BetterHostedServices that lets you inherit from CriticalBackgroundService which will end the application on an uncaught exception. Hopefully this issue will get fixed in the core library but until then this is a band-aid solution.

GeeWee on Jan 19, 2021

anurse was very adamant that a BackgroundService throwing an exception shouldn’t bring down the application. See the thread here:

https://github.com/dotnet/runtime/issues/36017#issuecomment-562710058

In my opinion, this issue is to make these two scenarios consistent:

If an exception is thrown during ExecuteAsync but before the first await is called
If an exception is thrown during ExecuteAsync but during or after the first await is called

Since the 2nd case shouldn’t bring down the application, neither should the first case.

eerhardt on Oct 9, 2020

A few targeted changes to BackgroundService.

We Task.Run the call to ExecuteAsync because user code cannot be trusted
Exceptions thrown in the background service are observed in StopAsync but never in StartAsync.

davidfowl on Jan 10, 2019

I have just been hit by this along with several colleagues. We have a long running background service pulling from a message queue (Google PubSub). The message queue service is started (await StartAsync) and completes only when manually stopped, or when an unrecovereable error occurs. This recently started periodically failing silently for long periods of time until a restart. And it took us a long time to figure out was a permission issue, even though an exception was thrown by StartAsync.

Our assumptions mirrors @GeeWee’s thoughts, that an unhandled exception is exceptional, and will result in the application exiting. This has been my logic for many years, stemming from multiple sources (some listed below). If you need something to be resilient against exceptions, then it is an active choice you make, and wrap it in a try…catch. Having code swallow exceptions that are too vague is something that I point out again and again in code reviews, since there is a risk of hiding unknown future errors, and since I expect the runtime to catch any unhandled exceptions.

BackgroundService seems to flip this upside down, and now we have to actively handle unknown exceptions in every BackgroundService we implement.

If no catch block is found, then the CLR displays an unhandled exception message to the user and stops execution of the program. https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/try-catch

If no exception handler for a given exception is present, the program stops executing with an error message. https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/exceptions/

When your code cannot recover from an exception, don’t catch that exception. Enable methods further up the call stack to recover if possible. https://docs.microsoft.com/en-us/dotnet/standard/exceptions/best-practices-for-exceptions

The exception is passed up the stack until the application handles it or the program terminates. https://docs.microsoft.com/en-us/dotnet/standard/exceptions/

Rassi on Jan 18, 2021

Right, I think it’s fine to suggest an option on HostOptions that would fail if any background service task throws but it wouldn’t be the default.

davidfowl on Oct 9, 2020

PR https://github.com/dotnet/runtime/pull/42981, submitted for 6.0, changes this scenario a bit.

If an exception is thrown during startup (e.g. in Main) and before ExecuteAsync is handled, the app will crash.

This behavior has not changed.

If an exception is thrown during ExecuteAsync but before the first await is called, the app will crash.

This behavior has not changed.

But for the below scenario:

If an exception is thrown during ExecuteAsync but during or after the first await is called, the app will … semi-hang? It stop processing … but … Ctrl-C does successfully start the app gracefully terminating.

(directly related to: https://github.com/dotnet/runtime/issues/36017) when this happens we log the exception

summary

The main discussion around this issue is around the fact that if we throw an exception before await vs after an await the app behaves differently.

maryamariyan on Oct 8, 2020

I would have expected worker services to have a built-in notion of global exception handling, but I guess not (yet)?

Anyway, here’s what I do for now:

protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
    while (!stoppingToken.IsCancellationRequested)
    {
        try
        {
            _logger.LogInformation("Worker running at: {time}", DateTimeOffset.Now);

            // actual stuff happens here
        }
        catch (Exception ex)
        {
            // waiting on https://github.com/aspnet/Extensions/issues/813
            _Logger.LogError(ex, "Global exception occurred. Will resume in a moment.");
        }

        await Task.Delay(1000, stoppingToken);
    }
}

chucker on Dec 18, 2019