runtime: Blocking Task.Result design flaw leads to deadlock when called from ThreadPool thread

Run this application:

https://gist.github.com/AqlaSolutions/f8c993bfad32dc9be11703fea4e01422

It simulates network engine receiving requests and calling synchronously IConnection.OnRequestReceived. This method calls async method and then synchronously processes the result retrieved with Task.Result. The async method may take up to 700 ms.

Assume I can’t use await, because external network engine provides synchronous interface and I need to check async call result before returning.

During run the simulation will decrease time between requests until deadlock occurs - after that point currently awaiting requests will never be completed.

I set TP to low threads number to show the problem. It will also happen on 200 and 1000 threads but it will just take more time and will require modifying the simulation code for more requests. In a real app you will never want to set max threads to big numbers because it will make your app very slow (context switching thing) and also each thread takes 1 MB RAM.

This example proves that absolutely any blocking call to Task.Result or Task.Wait from ThreadPool thread will work for some time but will eventually deadlock your application in truly high load scenario. Using these methods from ThreadPool thread for waiting is always a mistake and should be disallowed (but I understand that it will not happen for compatibility reasons).

Yes, we all know from docs that mixing async and blocking “may cause deadlocks” but previously I thought about this only in a way “do not block with .Result thread which synchronization context may be required to complete the waited task itself”. TP deadlock is not obvious and many people will “just use .Result” without ever encountering it during testing.

What do you think? May be some workarounds inside .NET TP? Prioritize continuations over new tasks?

About this issue

  • Original URL
  • State: closed
  • Created 8 years ago
  • Reactions: 2
  • Comments: 17 (9 by maintainers)

Most upvoted comments

VS has similar problems which is why they are pumping messages when doings waits

VS has other issues, such as needing to deal with multiple UI models and COM, and a single UI thread.

await and async have changed the systems enough that deadlocks happen more frequently

Ok. And, what is the desired change here? My response is the same: don’t use blocking APIs if you don’t want to block. I’ve already outlined some reasons why general pumping during Task.Wait()/Result would be a bad thing to do. Unless there’s a concrete proposal here for something to change, I’m not seeing the action item.

I don’t see how this is a flaw in Task.Result or how could it be fixed by the framework.

+1

But the reality is that you should never wait with .Result from TP threads

This has little to do with .Result… it’s simply about blocking in general. You’re right that one should avoid blocking as much as possible, especially on thread pool threads, but that doesn’t mean that it’s a design flaw to having APIs that block, rather it means one should try to avoid/minimize the usage of APIs that block in places where it’s undesirable. For example, it’s not a design flaw that ManualResetEventSlim.Wait() blocks, but it may be a design flaw of an app if it uses that method too much, in places it shouldn’t, etc.

Task.Result/Wait already do try to do a bit of additional smarts to alleviate some of the issue here: if you wait on a task backed by a delegate (e.g. Task.Run(someDelegate)), and that task hasn’t started executing yet, then the wait operation will ask the scheduler if it’s ok for the wait to “inline” the execution, i.e. rather than simply blocking and waiting for some other thread to come along and execute the task, since this thread is going to be blocked anyway, it may as well run the task rather than blocking waiting for it. But that only helps in some situations, and it doesn’t help if there’s nothing to actually run, e.g. the task returned from an async method is just a promise… there’s nothing associated with it to actually “run”, so waiting on it will block. We explicitly chose not to pump arbitrary work items in this case, in part as doing so could cause the wait to take much longer than it otherwise would have (the pumped work item may be completely unrelated).

@Clockwork-Muse ,

  1. deadlock and slowing down are different things. Deadlock leads to a scenario where even when requests are not coming anymore the server is still unable to do anything because all ThreadPool threads are blocked in infinite waiting.
  2. IO tasks perform callback through ThreadPool too but they are not affected by this deadlock so much because ThreadPool has much more dedicated IO slots than for normal cpu tasks.
  3. I’m not asking to raise thread priority, I’m asking to raise action priority in ThreadPool queue. Currently ThreadPool queue has no guaranteed ordering so adding some priority rules doesn’t affect compatibility.

I know this is some 3 years later, and this is closed, but: AqlaSolutions is completely correct. It is very error-prone these days when trying to adopt the TPL. Using .Result is for most people OK to use, not understanding that it will eventually break your application.

That this design passed through the .NET team is beyond me…

For context the author was having issues with NHibernate

NHibernate authors already stated that they are not going to support async and that it shouldn’t affect performance when compared to app logic. This is an example how “good” people usually understand TP. Do you still think they won’t use .Result in TP threads?

Anyway NHibernate is a very big project and I don’t believe that it can be easily migrated to async.

So for my project I will create async wrappers for the most blocking thing in NHibernate - getting connection from pool - which will move blocking calls to a separate custom TP and free the main TP for continuations.

@AqlaSolutions a non-blocking use of .Result (or .GetAwaiter().GetResult() which is similar but doesn’t give aggregate exceptions for single tasks) would be:

if (task.IsCompleted)
{
    if (task.RanToCompletion)
    {
         var result = task.GetAwaiter().GetResult();
    }
    else 
    {
         // error or cancelled
    }
}
else
{
    // not finished yet
}

@stephentoub, ManualResetEventSlim.Wait has it’s specific application area and in your example it’s applied incorrectly. But blocking calls to Task.Result are expected to be applicable in usual situations. Now considering the deadlock they are not applicable in area where they are expected to work (may be not efficiently, but at least stable!).

Developer expects blocking to perform slower but not to disable the whole app. The different is between “will be not efficient” and “block app from doing anything until restart”. The first is acceptable and expected but the second is what can actually happen. It’s not about choice between fast and slow but instead between working and hanging.

If you have a method which can’t be used in production because each time it’s called it has a chance to block the whole app thus making it unstable - it’s a design flaw.

@svick, again, people usually think that blocking with .Result is just not recommended but it won’t break an app. Recently I saw a comment on SO:

You don’t have to make the calling methods async - they can just do the call to the async method as var result = SomeAsyncMethod().Result. In this way, you can break the “async chain” until you are ready to refactor further up the call tree.

But the reality is that you should never wait with .Result from TP threads - if one request can do it - there is a chance to have 100 requests waiting like this at same time (but may be in different code places) and deadlock. This problem is not obvious and not documented enough.

On the TaskScheduler - I thought it affects only ContinueWith behavior. Can you describe more how using custom TaskScheduler should be done?