runtime: `HttpClient.SendAsync(...)` may fail to return a response that the server sends before the full request has been transmitted.
Description
`HttpClient.SendAsync(...)`, and by extension the whole family of derivative calls, incorrectly assumes that the server can only issue a response after the request content has been completely sent and consumed. While in most cases servers do indeed tend to issue a response only after having fully read and processed the request, this is not a requirement and does not hold in all cases; sometimes they can and do issue responses without fully consuming the request. In those cases, `HttpClient` fails with quirky internal exceptions (e.g. a socket-closed exception with `EPIPE`) and prevents the calling code from examining the received response.
Details
The case at hand is uploading large (>100 MB) content to an AWS S3 bucket, using a pre-signed URL via the `PUT` HTTP verb.
AWS S3 pre-signed URLs are generated ahead of time, with various conditions specified about how the URL can be used, from permissions to which headers must be present when accessing it. If a subsequent request to the generated URL doesn't match the specifications stated during its creation, AWS S3 rejects the call with a proper RESTful response.
AWS S3 in such cases does not read the full request before beginning its validation logic. It doesn't need to, because the headers are sufficient to determine whether the request is valid. If the request is invalid, there is no point wasting time and bandwidth reading the content of the `PUT` operation, as it will be discarded anyway. In such cases, where the request is rejected after examining only the request headers, S3 issues an HTTP response stating the error and hard-closes the connection.
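For reference, this is roughly how such a URL is generated with the AWS SDK for .NET (a sketch only; the bucket, key, and expiry below are placeholders, not values from the original report):

```csharp
// Sketch: generating a pre-signed PUT URL with the AWS SDK for .NET.
// Bucket, key, and expiry are placeholders, not from the original report.
using System;
using Amazon.S3;
using Amazon.S3.Model;

var s3 = new AmazonS3Client(); // credentials/region from the environment

var presign = new GetPreSignedUrlRequest
{
    BucketName  = "example-bucket",
    Key         = "uploads/video.mp4",
    Verb        = HttpVerb.PUT,
    ContentType = "video/mp4",                   // must match the later PUT
    Expires     = DateTime.UtcNow.AddMinutes(30)
};

// Any later request that doesn't match these conditions (verb, headers,
// expiry) is rejected by S3 based on the headers alone.
string uploadUri = s3.GetPreSignedURL(presign);
```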
When the server hard-closes the connection while `HttpClient` is still writing the request content, `HttpClient` encounters an error on the underlying socket, fails to understand the nature of that error, and, more importantly, fails to attempt to read the received response pending in the network buffer and recognize that there is a valid response it could process. Instead, it does this:
```
System.Net.Http.HttpRequestException: Error while copying content to a stream.
 ---> System.IO.IOException: Unable to read data from the transport connection: Broken pipe.
 ---> System.Net.Sockets.SocketException (32): Broken pipe
   --- End of inner exception stack trace ---
   at System.Net.Security.SslStream.<WriteSingleChunk>g__CompleteAsync|210_1[TWriteAdapter](ValueTask writeTask, Byte[] bufferToReturn)
   at System.Net.Security.SslStream.WriteAsyncChunked[TWriteAdapter](TWriteAdapter writeAdapter, ReadOnlyMemory`1 buffer)
   at System.Net.Security.SslStream.WriteAsyncInternal[TWriteAdapter](TWriteAdapter writeAdapter, ReadOnlyMemory`1 buffer)
   at System.Net.Http.HttpConnection.WriteAsync(ReadOnlyMemory`1 source)
   at System.IO.Stream.CopyToAsyncInternal(Stream destination, Int32 bufferSize, CancellationToken cancellationToken)
   at System.Net.Http.HttpContent.CopyToAsyncCore(ValueTask copyTask)
   --- End of inner exception stack trace ---
   at System.Net.Http.HttpContent.CopyToAsyncCore(ValueTask copyTask)
   at System.Net.Http.HttpConnection.SendRequestContentAsync(HttpRequestMessage request, HttpContentWriteStream stream, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithNtConnectionAuthAsync(HttpConnection connection, HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.FinishSendAsyncBuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
   at MobileLabs.DeviceConnect.Framework.Http.HttpClient.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
```
This is incorrect behavior. When `HttpClient` experiences a request write error, it should not assume that all is lost; it should still attempt to read and parse the receive buffer to see whether it contains a full and valid response. If it does, `HttpClient` should return it as if the operation completed without hiccups; if it doesn't, it should throw an exception indicating abnormal operation during the request write.
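For illustration, this is all the calling code can do today (a sketch; the wrapper name is mine): the write failure surfaces as an `HttpRequestException` wrapping the `IOException` from the stack trace above, with no `HttpResponseMessage` to inspect.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

static class Sketch
{
    // Hypothetical wrapper showing the current caller-visible contract:
    // on a mid-upload reset there is no response object to examine, only
    // the wrapped IOException.
    public static async Task<HttpResponseMessage> SendAsync(
        HttpClient client, HttpRequestMessage request, CancellationToken token)
    {
        try
        {
            return await client.SendAsync(request, token);
        }
        catch (HttpRequestException ex) when (ex.InnerException is IOException)
        {
            // The server may already have sent a complete rejection
            // response, but HttpClient exposes no way to read it here.
            throw;
        }
    }
}
```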
Suggested repro steps
Since a proper demonstration would require a web server endpoint with specific behavior, the sample code below is limited to the client side. You can use the aforementioned AWS S3 setup and craft a pre-signed URL that will reject the upload request, or any other endpoint that reads the headers but not the content, issues a proper and valid response rejecting the request, and closes the socket before the upload completes.
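A minimal local stand-in for such an endpoint might look like the following (a hypothetical sketch; the class name and port are illustrative, not from the original report):

```csharp
// Hypothetical local stand-in for an endpoint that validates headers only,
// rejects the request, and closes without reading the body.
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

class RejectingEndpoint
{
    static void Main()
    {
        var listener = new TcpListener(IPAddress.Loopback, 8080);
        listener.Start();
        Console.WriteLine("PUT something large to http://localhost:8080/ ...");

        using var client = listener.AcceptTcpClient();
        var stream = client.GetStream();

        // Read only up to the end of the headers (CRLFCRLF).
        var headers = new StringBuilder();
        var buffer = new byte[8192];
        while (!headers.ToString().Contains("\r\n\r\n"))
        {
            int read = stream.Read(buffer, 0, buffer.Length);
            if (read == 0) return;
            headers.Append(Encoding.ASCII.GetString(buffer, 0, read));
        }

        // Reject based on the headers alone; the body is never consumed.
        byte[] response = Encoding.ASCII.GetBytes(
            "HTTP/1.1 403 Forbidden\r\nContent-Length: 0\r\nConnection: close\r\n\r\n");
        stream.Write(response, 0, response.Length);

        // Closing while the client is still streaming the body makes the
        // kernel answer further body segments with RST, which the client
        // sees as a broken pipe / connection reset mid-upload.
        client.Close();
    }
}
```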
The following client code was used and resulted in the exception above:

```csharp
var request = new HttpRequestMessage(HttpMethod.Put, uploadUri)
{
    Content = new StreamContent(largeVideoStreamOfHundredsMegabytes)
};
request.Content.Headers.ContentType = new System.Net.Http.Headers.MediaTypeHeaderValue("video/mp4");

var uploadTimeout = TimeSpan.FromMinutes(Timeout);
using var uploadTimeoutToken = new CancellationTokenSource(uploadTimeout);
using var response = await Framework.Http.HttpClient.Instance.SendAsync(request, uploadTimeoutToken.Token);
```
Configuration
- .NET Core 3.x (I don't think it really matters)
- macOS 10.15.6 x86

Probably not specific to any configuration or platform; it's how `HttpClient` is coded.
Regression?
Nah. Rather, it's an API design issue that doesn't properly foresee such a scenario.
Other information
The problem, as hinted above, is the API assuming that the request will be read fully before a response is issued. A server can issue a response without fully reading the request, or can even issue and stream a response while it is still reading the request (e.g. a media-conversion endpoint that converts media as it reads it and produces response output without storing an interim copy, i.e. converts on the fly and writes the result back to the client on the fly).
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 18 (11 by maintainers)
That’s interesting. This seems to be a platform difference between Linux and Windows. I ran my sample code from above on Linux, and it does indeed receive all sent data before throwing the connection reset exception.
That said, my general point here stands: Once you send a RST you cannot rely on the peer receiving any data you sent before the RST. That’s certainly true on Windows, as shown above, and may be true on other platforms as well. It may even be true on Linux in certain situations.
Yes, but that’s another good reason to avoid sending a RST in this case. The sending application does not know whether the data that it has sent (i.e. placed into the send buffer) has actually been sent and acked by the peer or not. So you cannot safely send a RST, because there could always be unacked data in your own send buffer that would be blown away.
In fact your SLEEP_A_BIT code above demonstrates this: it’s only safe to send the RST after a nontrivial delay, because otherwise you might blow away your own send buffer. In your example, the 1 second delay is enough; but in general, over non-loopback, you can’t really know when it’s okay to actually send the RST.
There are good reasons to drop the receive buffer on receiving RST: it allows you to notify the application promptly that the connection has been reset. For example, imagine the server starts sending a large response, and then in the middle of sending the response, a fatal error occurs and the connection is reset. The client is processing data as it comes in, but it may have a large backlog of data in its receive buffer that hasn’t been delivered to the application yet. If it continues to deliver this data to the application even after receiving the RST, then the application needs to process all this data before it will be notified of the RST, which is probably just wasted time and CPU. Delivering the RST immediately avoids this problem.
(And because of that, I’m a bit surprised that Linux behaves the way it does…)
@geoffkizer I think you are missing a very important point here: When you make a low-level `send()` call and this call returns successfully, this does not mean that any data has been sent yet. It only means that the data has been accepted and placed into the send buffer of the socket, and that the kernel will send the data out at the next possible occasion, as fast as the network allows. If you make a socket send `RST` to the other side, then the local socket must drop all data still in the send buffer, as that is mandated by the RFC. However, the RFC nowhere says that data in the receive buffer on the other side has to be dropped. That would also make little sense, as this data has already been confirmed as correctly received; but being correctly received doesn't mean the application has had any chance to see that data yet, it only means that the data was placed in the receive buffer. A sender does not expect data which has already been acknowledged by the other side as correctly received to suddenly become "not correctly received" when it issues a TCP `RST`.

Let me demonstrate that with a very simple sample program. Unfortunately I don't know C# or .NET, so please forgive me for just using plain old C code instead. I will include the full source code below and link to an online resource where you can directly test that code in your browser.
Let’s just quickly explain what the program does:
The program creates a local TCP server socket, connects a local client socket to it, writes 200 integer numbers from client to server, closes the client socket, and finally tries to read as much data as possible from the server socket, so we can see what happens.
There are three interesting pre-processor defines on top:

- `HAVE_SIN_LEN` - This is just to make the code compatible with (Free-)BSD and macOS/iOS. On those systems the define must be set to 1, whereas on Linux it must be set to 0. On Windows… I have no idea, but you probably know; also I'm not sure how much of that code is going to compile on Windows in the first place.
- `CLOSE_WITH_RST` - If set to 1, the client will ensure the socket is closed by a `RST`. If set to 0, the client will close the socket gracefully instead, so it will be closed by a `FIN` and then linger. It would only close by `RST` after the linger time has passed, but the program won't run long enough for that to happen.
- `SLEEP_A_BIT` - If set to 1, the client will wait an entire second before closing the socket after sending the data. If set to 0, it will close the socket as fast as possible. This holds true no matter whether the socket is closed by `RST` or by `FIN`. It just adds some delay to give the kernel enough time to perform the actual send operations required to transfer all the data in the send buffer, and yes, this does make a difference!

Now here's the output if I disable `CLOSE_WITH_RST`, no matter if sleeping or not: as you can see, I get all 200 numbers and the stream closes normally. But what happens if I enable `CLOSE_WITH_RST` and keep `SLEEP_A_BIT` disabled? Well, the result will vary. It can be like this (up to 159), or it can be like this (up to 47). But what happens if I enable `SLEEP_A_BIT`? Well, then the result will always go up to 199 before I run into an error.

And that's because of what I explained at the very top: the RFC demands that after sending a `RST`, no further data must be sent to the peer, hence all unsent data in the local send buffer will be lost. But no acknowledged data in the receive buffer of the peer will be lost.

Try running that code online: https://onlinegdb.com/8ToVOsWSP It always fails at 16 for me, because there wasn't enough time to transfer more than the first 16 integers before the socket is closed forcefully.
Now let’s enabled sleeping: https://onlinegdb.com/aGMZcpt0A See, it runs through all the way to 199. There’s still an error in the end as receiving a
RST
is an erroneous situation the other side should be aware of, but already sent data won’t be lost because of aRST
.You may dispute that setting Linger time to zero will force a
RST
but it’s running the code locally it’s easy to prove that this is the case, just keep this open in a terminal windowsudo tcpdump -i lo0 tcp
With `CLOSE_WITH_RST` enabled, tcpdump shows the `R` (`RST`) flag; with `CLOSE_WITH_RST` disabled, it shows the `F` (`FIN`) flag instead.

Finally, let me append the entire source code below, just in case, as I don't know how long the online service referenced above will keep that code available, or how long that service is going to exist. So in case it goes down, here's the code once again:
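(The original C listing is not reproduced here. Purely as an illustration for .NET readers, an approximate C# rendition of the same experiment might look like this; the loopback setup, message count, and linger settings mirror the description above, but everything else is my own assumption.)

```csharp
// Illustrative C# analogue of the experiment described above (not the
// original C listing): the client writes 200 integers, optionally sleeps,
// then closes with linger time 0 to force a RST; the server then reads
// back whatever it can.
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Threading;

class RstExperiment
{
    const bool CloseWithRst = true;   // mirrors CLOSE_WITH_RST
    const bool SleepABit    = true;   // mirrors SLEEP_A_BIT

    static void Main()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        var client = new TcpClient();
        client.Connect(IPAddress.Loopback, port);
        using var server = listener.AcceptTcpClient();

        // Queue 200 integers into the client's send buffer.
        var outStream = client.GetStream();
        for (int i = 0; i < 200; i++)
            outStream.Write(BitConverter.GetBytes(i), 0, 4);

        // Give the kernel time to actually transmit the send buffer
        // before a RST would discard whatever is still queued.
        if (SleepABit) Thread.Sleep(1000);

        // Linger time 0 forces a RST on Close(); otherwise FIN + linger.
        if (CloseWithRst)
            client.Client.LingerState = new LingerOption(true, 0);
        client.Close();

        // Read back as much as possible on the server side.
        int received = 0;
        var buf = new byte[4];
        try
        {
            var inStream = server.GetStream();
            while (inStream.Read(buf, 0, 4) == 4) // loopback: whole reads suffice for a demo
            {
                Console.WriteLine(BitConverter.ToInt32(buf, 0));
                received++;
            }
        }
        catch (IOException e)
        {
            Console.WriteLine($"Connection reset after {received} integers: {e.Message}");
        }
    }
}
```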
Unfortunately, it won’t help you here, this is the place where it gets set up: https://source.dot.net/#System.Net.Http/System/Net/Http/HttpResponseMessage.cs,172. Meaning only if everything proceeded without an issue and you got non-success status code. Since the server is closing TCP connection on you, the status code won’t be set.
I dug a bit into the AWS docs (please correct me if I'm looking at the wrong docs), and it seems that they do support 100-continue: https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectPOST.html
It seems to me like it’s exactly made for your case.
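If that's accurate, the client-side opt-in is a one-liner (a sketch; `uploadUri` is the pre-signed URL from the repro above):

```csharp
using System.Net.Http;

var request = new HttpRequestMessage(HttpMethod.Put, uploadUri);
// Opt in to Expect: 100-continue. If the server honors it, the request
// body is transmitted only after an interim "100 Continue" response, so
// a header-based rejection (e.g. 403) arrives before the upload starts.
request.Headers.ExpectContinue = true;
```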