runtime: HttpClient Response Stream hangs

There have already been two similar requests, #36822 and #11594. #36822 even has an excellent repro: https://github.com/dotnet/runtime/issues/36822#issuecomment-632821881.

I discovered the issue in PowerShell:

  1. Start downloading a large file:
Invoke-WebRequest https://github.com/PowerShell/PowerShell/releases/download/v7.3.0-preview.7/PowerShell-7.3.0-preview.7-win-x64.zip -OutFile C:\temp\PowerShell-7.3.0-preview.7-win-x64.zip
  2. Emulate a network failure in one of two ways: 2.1 turn off the Wi-Fi connection on the notebook; 2.2 disconnect the Wi-Fi router from the Internet.
  3. The PowerShell cmdlet hangs in CopyToAsync():
  • for option 2.1, until Wi-Fi is restored (if the network was down for more than ~1 min, the download is interrupted)
  • for option 2.2, indefinitely, even after the Internet connection is restored
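
For readers who do not use PowerShell, the pattern that hangs can be sketched in C# roughly as follows. This is an illustrative assumption about the shape of the code, not the actual cmdlet implementation; the class and method names (HangSketch, DownloadAsync) are hypothetical.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

static class HangSketch
{
    // Hypothetical helper: streams a response body to a file with no timeout on the body.
    public static async Task DownloadAsync(Uri uri, string path)
    {
        using var client = new HttpClient();

        // ResponseHeadersRead returns as soon as the headers arrive,
        // so the body is read later, directly from the network stream.
        using var response = await client.GetAsync(uri, HttpCompletionOption.ResponseHeadersRead);
        response.EnsureSuccessStatusCode();

        await using var body = await response.Content.ReadAsStreamAsync();
        await using var file = File.Create(path);

        // No CancellationToken is passed here. If the network silently drops,
        // nothing in this call bounds how long a pending read may wait.
        await body.CopyToAsync(file);
    }
}
```

In this sketch, once the headers have been received, the only thing that can end a stalled CopyToAsync() is the connection being reported as closed or the caller cancelling the operation.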

This is not a problem in interactive scenarios, since the user can press Ctrl-C, but it is an unpleasant problem for scripts and for custom hosts built on the PowerShell SDK. Because the hang leaves no trace in the logs and goes away after a reboot, technical support concludes that it is a hardware problem, which is a false lead. The PowerShell user also gets no message that the file was not fully downloaded.

Note that it is the web client that we are considering. Why is it important to have a timeout?

The infinite default timeout of sockets is a fundamental constant and will never be changed. It is useful for low-level connections that can re-establish themselves without any extra effort. But how do web services behave? They close the connection if the client doesn’t send requests for some time, so as not to waste resources, and this timeout is quite short. If any web server closes a session after a short timeout, then an infinite socket-level timeout makes no sense for a web client, because reconnection at the socket level doesn’t work. On the contrary, the client must receive a timeout signal so that it can either end the session on its side or try to resume it at a higher level. This suggests that any web client should have a default timeout greater than the typical timeout of a web server. If there is any doubt that there should be a default timeout, then the developer should at least be able to set one.


In which APIs is it better to add this timeout? I don’t have the exact answer.

Obviously the timeout must be in SocketsHttpHandler.

But this is not enough, since this class is sealed. PowerShell has historically used HttpClientHandler, which is a wrapper around SocketsHttpHandler. We could migrate PowerShell to SocketsHttpHandler, but HttpClientHandler uses some internal helper classes to map to SocketsHttpHandler. So perhaps it makes sense to add ReadWriteSocketTimeout to HttpClientHandler too (or to make the helper classes public).

Note that the WebRequest API supports this timeout (although it doesn’t seem to be public either). https://github.com/dotnet/runtime/blob/e71a9583b4d6c9bd97edd87cda7f98f232f63530/src/libraries/System.Net.Requests/src/System/Net/HttpWebRequest.cs#L1666-L1702

This uses a callback, which is not convenient. In addition, it duplicates the standard code from the SocketsHttpHandler, which can be changed in the future but will not be automatically inherited by the application.

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 46 (46 by maintainers)

Most upvoted comments

Triage:

  • We agree that Stream.ReadTimeout and Stream.WriteTimeout should NOT apply to async operations.
  • We agree HttpClient.Timeout should NOT extend to stream read/write operations. It would break WebSockets, YARP and other customers.
  • We agree that handling these timeouts only via cancellation is less than ideal and perhaps we should expose something more user friendly, e.g. something like HttpClient.ReadWriteTimeout or HttpResponseStream.ReadWriteTimeout (though we might not want to do that one) … which may very well lead back to the 1st case above (Stream.ReadTimeout or Stream.WriteTimeout).

@iSazonov is there anything else that you think needs attention / response from triage?

Given that the discussion here is already very long and covers many angles, we should fork the 3rd case above into a separate issue and discuss the solution there, then close this issue. All of this assumes we didn’t miss something else in the discussion.

@stephentoub Many thanks for your help! I now have a better understanding of how this works internally - a great experience for me.

In the experiment I ran, the cancellation token is not involved.

It should be. If PowerShell wants to time out the operation, it has the ability to do so via the token.
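
As a rough illustration of timing out via the token, the sketch below copies a response stream with an idle timeout that restarts before every read. It is a minimal sketch under stated assumptions, not an API from the runtime or from PowerShell: the names IdleTimeoutCopy and CopyWithIdleTimeoutAsync, the buffer size, and the choice to surface the timeout as TimeoutException are all hypothetical.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

static class IdleTimeoutCopy
{
    // Hypothetical helper: copies source to destination and fails
    // if a single read receives no data within idleTimeout.
    public static async Task CopyWithIdleTimeoutAsync(
        Stream source,
        Stream destination,
        TimeSpan idleTimeout,
        CancellationToken cancellationToken = default)
    {
        // Linked source: cancelled by the caller's token or by the idle timer.
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        var buffer = new byte[81920];

        while (true)
        {
            // Restart the idle timer before every read.
            cts.CancelAfter(idleTimeout);

            int read;
            try
            {
                read = await source.ReadAsync(buffer.AsMemory(), cts.Token).ConfigureAwait(false);
            }
            catch (OperationCanceledException) when (!cancellationToken.IsCancellationRequested)
            {
                // The timer fired, not the caller: report it as a timeout.
                throw new TimeoutException($"No data received within {idleTimeout}.");
            }

            if (read == 0)
            {
                break; // end of stream
            }

            // Stop the timer so that slow local writes are not counted as network idle time.
            cts.CancelAfter(Timeout.InfiniteTimeSpan);
            await destination.WriteAsync(buffer.AsMemory(0, read), cancellationToken).ConfigureAwait(false);
        }
    }
}
```

A caller would obtain the body with HttpCompletionOption.ResponseHeadersRead and ReadAsStreamAsync(), then call something like IdleTimeoutCopy.CopyWithIdleTimeoutAsync(body, file, TimeSpan.FromSeconds(30)). The 30-second value is only an example. The idle timer is per read rather than per download, so a large but steadily progressing transfer is not interrupted.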