runtime: HTTP2: Flakiness when server doesn't completely read request
I have created a repro for an error that sometimes shows up on the grpc-dotnet CI server.
Repro:
git clone https://github.com/JamesNK/grpc-dotnet.git
git checkout jamesnk/stress-maxsize
dotnet test test\FunctionalTests --filter Name~ReceivedMessageExceedsSize_ThrowError
The logic in the test is:
1. Client makes a non-streaming call to the server and sends a large request body
2. Server ends the response without reading the complete request body
3. Client asserts the response
4. Go to 1.
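The sharing pattern that triggers the failure can be modeled with a minimal sketch. This is not the real repro (which is the C# test linked above); `FakeClient` is a hypothetical stand-in for the single shared HttpClient, and only the threading structure mirrors the test:

```python
import threading

# Minimal model of the failing pattern: several worker threads share ONE
# client object and each repeatedly runs the request loop above. In the
# real repro the shared object is an HttpClient multiplexing one HTTP/2
# connection; here FakeClient is a hypothetical stand-in that only counts
# requests, to show the sharing structure rather than real traffic.
class FakeClient:
    def __init__(self):
        self._lock = threading.Lock()
        self.requests_sent = 0

    def send_large_request(self):
        # Real test: send a large request body; the server ends the
        # response without reading it all, then the client asserts.
        with self._lock:
            self.requests_sent += 1

def worker(client, iterations):
    for _ in range(iterations):
        client.send_large_request()

client = FakeClient()
threads = [threading.Thread(target=worker, args=(client, 100))
           for _ in range(4)]  # one thread alone does not error
for t in threads:
    t.start()
for t in threads:
    t.join()
print(client.requests_sent)  # 4 threads x 100 iterations -> 400
```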
When a single thread executes this loop, there is no error.
When multiple threads share one HttpClient, it consistently throws this error:
System.Threading.Tasks.TaskCanceledException : A task was canceled.
Stack Trace:
at TaskCompletionSourceWithCancellation`1.WaitWithCancellationAsync(CancellationToken cancellationToken)
at Http2Connection.SendStreamDataAsync(Int32 streamId, ReadOnlyMemory`1 buffer, CancellationToken cancellationToken)
at Http2Stream.SendDataAsync(ReadOnlyMemory`1 buffer, CancellationToken cancellationToken)
at HttpContent.CopyToAsyncCore(ValueTask copyTask)
at Http2Stream.SendRequestBodyAsync(CancellationToken cancellationToken)
at Http2Connection.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
at RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at HttpClient.FinishSendAsyncBuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
at TaskExtensions.TimeoutAfter[T](Task`1 task, TimeSpan timeout, String filePath, Int32 lineNumber) in TaskExtensions.cs line: 58
at <<ReceivedMessageExceedsSize_ThrowError>b__1>d.MoveNext() in MaxMessageSizeTests.cs line: 77
--- End of stack trace from previous location where exception was thrown ---
at MaxMessageSizeTests.ReceivedMessageExceedsSize_ThrowError() in MaxMessageSizeTests.cs line: 87
at GenericAdapter`1.BlockUntilCompleted()
at NoMessagePumpStrategy.WaitForCompletion(AwaitAdapter awaitable)
at AsyncToSyncAdapter.Await(Func`1 invoke)
at TestMethodCommand.RunTestMethod(TestExecutionContext context)
at TestMethodCommand.Execute(TestExecutionContext context)
at <>c__DisplayClass1_0.<Execute>b__0()
at BeforeAndAfterTestCommand.RunTestMethodInThreadAbortSafeZone(TestExecutionContext context, Action action)
.NET Core SDK (reflecting any global.json):
Version: 3.0.100-preview9-013697
Commit: baac5f4d5c
// @geoffkizer
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 37 (37 by maintainers)
I don’t view it urgent unless this blocks some gRPC scenario. I’m going to move this to 5.0 for now unless somebody objects.
@wfurt Nice work tracking this down. There doesn’t appear to be a client issue here, so I’m closing.
We definitely could make the half-closed stream tracking more efficient, but I don’t think that gets us much. Even if we made the limit 10x the max stream count instead of 2x, the client would still only be able to send ~200 requests per second. That may be better than 20 RPS, but not by much. Of course, if the client were to respond to the server RST with an RST/EOS of its own before opening new streams, there wouldn’t be this seemingly arbitrary RPS limit.
The only way to get a reasonable RPS without tracking thousands or even millions of half-closed streams (given the 5s grace period) is to have the client finish closing the streams before opening new ones.
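The throughput ceiling described above can be put into a back-of-the-envelope formula. All numbers below are hypothetical (the thread's exact figures depend on the connection's advertised max concurrent stream count, which isn't stated here); the sketch only shows how the tracked-stream budget and the 5 s grace period bound the request rate:

```python
# Back-of-the-envelope model of the RPS ceiling: if every request ends as a
# half-closed stream that must be tracked for grace_period_s seconds, and
# the client tracks at most (multiplier * max_concurrent_streams) of them,
# then the budget refills only once per grace period.
def max_rps(max_concurrent_streams: int, multiplier: int,
            grace_period_s: float) -> float:
    """Sustained requests per second under the tracking limit."""
    return (multiplier * max_concurrent_streams) / grace_period_s

# Hypothetical server limit of 50 concurrent streams, 5 s grace period:
print(max_rps(50, 2, 5.0))   # 2x tracking limit  -> 20.0 RPS
print(max_rps(50, 10, 5.0))  # 10x tracking limit -> 100.0 RPS
```

Raising the multiplier only scales the ceiling linearly, which is why the comment concludes that closing streams promptly (client answering the server's RST) is the only way to remove the limit entirely.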
@stephentoub / @geoffkizer Will one of you be able to look at this bug?
The latest 3.0 SDK has https://github.com/dotnet/corefx/pull/40112. With it I still get this error on my machine.
I nuked a bunch of SDK related stuff to try to ensure it’s running the expected bits, and now I can repro this as well on current release/3.0 bits.
Also, apparently this SDK contains a System.Net.Http.dll from the future. Pretty cool!
08/09/2019 07:19 PM 1,475,448 System.Net.Http.dll