runtime: Reduce the impact of blocking DNS calls on Unix

I ended up here after helping a customer investigate a thread pool starvation case on Linux on .NET Core 3.1. After some research and some discussion on Twitter, it turns out that getaddrinfo_a uses an internal thread pool, blocks on getaddrinfo, and isn’t doing any async I/O. This change is an improvement over what we had before because our thread pool doesn’t grow, but I’m not sure it is a net positive in the long run. The thread pool limits are controlled by compile-time constants in glibc (essentially, another library is doing async-over-sync for us on a less controllable thread pool…).
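To make the async-over-sync pattern concrete, here is a minimal Python sketch of it. This is not glibc’s or .NET’s code, only an illustration: a blocking getaddrinfo call is pushed onto a small dedicated pool so the caller’s own threads never block. The names `MAX_DNS_THREADS`, `dns_pool`, `blocking_lookup`, and `lookup_async` are hypothetical, and the limit of 4 is an arbitrary stand-in rather than glibc’s real constant.

```python
import asyncio
import socket
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Hypothetical cap on helper threads (an assumption for illustration only).
MAX_DNS_THREADS = 4

# Dedicated pool: blocking lookups run here instead of on the caller's threads.
dns_pool = ThreadPoolExecutor(max_workers=MAX_DNS_THREADS, thread_name_prefix="dns")


def blocking_lookup(host: str) -> List[str]:
    """Synchronous lookup; blocks the calling thread until DNS answers."""
    infos = socket.getaddrinfo(host, None, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


async def lookup_async(
    host: str, blocking: Callable[[str], List[str]] = blocking_lookup
) -> List[str]:
    """Async facade: the await does not block, but a pool thread still does."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(dns_pool, blocking, host)
```

Once more than `MAX_DNS_THREADS` lookups are outstanding, the rest queue behind the blocked workers, so the blocking has only moved to a pool the caller can’t see or tune.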

I wonder if we’re better off controlling this blocking code ourselves, and maybe it should be possible to turn this off with a configuration switch.

The other improvement I was thinking about was allowing only one pending request per host name at a time. That would improve situations where DNS is slow and new blocking calls for the same host name keep being issued on thread pool threads (which is the case the customer ran into).
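A rough sketch of what “one pending request per host name” could look like, written in Python for brevity rather than as the actual runtime change. Concurrent callers for the same host share a single in-flight lookup instead of each starting their own. `CoalescingResolver`, `Resolver`, `lookup`, and `system_resolve` are hypothetical names, and keying on the lower-cased host name is an assumption of this sketch.

```python
import asyncio
import socket
from typing import Awaitable, Callable, Dict, List

# Hypothetical resolver shape: host name in, list of address strings out.
Resolver = Callable[[str], Awaitable[List[str]]]


async def system_resolve(host: str) -> List[str]:
    """Default resolver: the event loop runs getaddrinfo in its executor."""
    loop = asyncio.get_running_loop()
    infos = await loop.getaddrinfo(host, None, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


class CoalescingResolver:
    """Shares one in-flight lookup among all concurrent callers for a host."""

    def __init__(self, resolve: Resolver = system_resolve) -> None:
        self._resolve = resolve
        self._pending: Dict[str, "asyncio.Future[List[str]]"] = {}

    async def lookup(self, host: str) -> List[str]:
        key = host.lower()  # assumption: treat host names case-insensitively
        fut = self._pending.get(key)
        if fut is None:
            # First caller for this host starts the real lookup.
            fut = asyncio.ensure_future(self._resolve(host))
            self._pending[key] = fut
            # Forget the entry once it completes so later calls resolve afresh.
            fut.add_done_callback(lambda _: self._pending.pop(key, None))
        # shield: one caller being cancelled must not cancel the shared lookup.
        return await asyncio.shield(fut)
```

With this shape, a slow DNS server costs at most one blocked lookup per distinct host name, however many callers pile up behind it. Note that this only coalesces; it does not cache, so a call made after the lookup completes starts a new one.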

cc @geoffkizer @stephentoub @scalablecory

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 54 (54 by maintainers)

Most upvoted comments

OK so summary:

  • Revert the current async change
  • Do #19443 in runtimelab
  • Discuss options to reduce the impact of blocking DNS calls to the same host name (e.g. coalesce calls to the same host name)

They handle this case differently to the general case of blocking.

As I said above, I’m happy to throttle/synchronize/whatever specific kinds of blocking calls when it makes sense because we have knowledge of what the calls are doing and how they are likely to block, etc. The per-hostname serialization here is a good example of it.

But I don’t think this addresses the general problem.

we should run these on a different set of threads.

Why different threads? Threads are threads. What I think matters is whether any given thread is blocking at the moment, or not.

Triage: Scoped down to the third bullet:

  • Discuss options to reduce the impact of blocking DNS calls to the same host name (e.g. coalesce calls to the same host name)