runtime: HttpClient Http/2 doesn't flush frames when hitting the window limit

I have come across a strange issue when sending json log events using Serilog using the HTTP sink, but the issue will only appear if the receiving ASP.NET API is behind an F5 load balancer. The issue is that not all the data will be sent to the VIP on the F5 and the sending will hang until a timeout is reached (be that in IIS, httpClient timeout or F5 timeout).

I was able to get some POST’s request to work by modifying the HTTP/2 profile on the F5 (https://support.f5.com/csp/article/K04412053), altering the Receive Window setting from the default 32kb, they only support a maximum of 128KB in this setting. That will work if I am sending say 50 events, but the default for the sink and what we would like to use is 1000 events. Due to the unknown size of what a list of events may be, I can not guarantee adjusting this receive window size would be a reasonable fix.

In summary:-

Sending API ->(HTTP/2)-> Receiving API - Works OK Sending API ->(HTTP/1.1)-> Receiving API - Works OK

Sending API ->(HTTP/2)-> F5 VIP --> F5 Server Side ->(HTTP/2)-> Receiving API - Does Not Work Sending API ->(HTTP/1.1)-> F5 VIP --> F5 Server Side ->(HTTP/2)-> Receiving API - Works OK

I have raised a case with F5 and they seem reluctant to investigate as they have used nghttp and are able to easily send 1000 events using HTTP/2 and indeed I can use curl to do the same, so they are saying this is pointing to aspnet rather than F5.

Example curl command that works fine curl -vkX POST -H "Content-Type: application/json" -d @Events100.json https://IPADDRESS_HERE --http2

The original setup is using aspnetcore 3.1 web api that logs using serilog with the http sink to another aspnetcore 3.1 web api that will receive these serilog events and store them in a database.

For further testing I have created a simple dotnetcore 3.1 console application that will read a file (which just contains the events serilog would have been sending) and uses a httpClient to send to a basic aspnetcore 3.1 web api that will receive the data (not using IIS but just plain Kestrel) and write out to a file, this testing also fails with the timeout scenario.

This is all running on Windows Server 2019 Core.

For reference the events I am talking about are like this:-

{
  "events": [
    {
      "Timestamp": "2021-05-05T10:25:32.5505666+00:00",
      "Level": "Information",
      "MessageTemplate": "HTTP {RequestMethod} {RequestPath} responded {StatusCode} in {Elapsed:0.0000} ms",
      "RenderedMessage": "START - HTTP \"GET\" \"/weatherforecast\" responded 200 in 2.1560 ms",
      "Properties": {
        "RequestMethod": "GET",
        "RequestPath": "/weatherforecast",
        "StatusCode": 200,
        "Elapsed": 2.156,
        "SourceContext": "Serilog.AspNetCore.RequestLoggingMiddleware",
        "RequestId": "80000812-0006-f800-b63f-84710c7967bb",
        "SpanId": "|478ab71d-48fb709e612eba23.",
        "TraceId": "478ab71d-48fb709e612eba23",
        "ParentId": "",
        "MachineName": "HOSTNAME_HERE",
        "ThreadId": 9,
        "Application": "APP_NAME_HERE"
      },
      "Renderings": {
        "Elapsed": [
          {
            "Format": "0.0000",
            "Rendering": "2.1560"
          }
        ]
      }
    },
   !-- x amount of more events here --!
  ]
}

I do have wireshark traces that I have sent to F5 and can easily reproduce the issue. Before going on to that, is this the right place to raise this issue?

Also, is this going to be a battle of vendors where each one blames the other and the customer (me) doesn’t get anywhere?

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 22 (15 by maintainers)

Most upvoted comments

@deathchurch The current F5 behavior will result in poor performance. The client will send up to the window size, then it will need to wait until it gets a window update from the server. This will result in a stall in sending approx equal to the round trip time.

To improve performance, they should send window updates more frequently. Exactly how frequently is a somewhat complex question. For HttpClient, we send a window update for every approx (total window size / 8) bytes consumed by the app. I believe ASP.NET does something similar. But we’re also looking at changing our receive window size heuristics for 6.0. There’s not necessarily a perfect answer here, but what they are currently doing is definitely suboptimal.

Thanks @geoffkizer I think F5 were of the same opinion, they quoted https://datatracker.ietf.org/doc/html/rfc7540#section-5.2.1, and that on the client side they wait for the window to be filled before sending a window update, also they queried why they would they be expected to update the window if its not being used. I think they were open to any suggestions from Microsoft as to what they could improve.

I see you have a PR, hopefully that will resolve the problem, would that be targeted only to net6.0 or would that be available for net5.0 and netcore3.1?

Curl trace: tcp.stream eq 0

  • F5 sets its connection window size to 32768
  • The client sets its window size to 33554432(?)
  • The client sets sends a connection window update to 33488897
  • POST Content-Length 46486 (smaller than HttpClient?)
  • DATA 16384
  • DATA 16384
  • F5 sends a stream window update of 32768
  • DATA 13718
  • 200 OK response

So F5 is sending a window update in this scenario, which allows the client to continue the request.

Maybe that HttpClient 9s pause is the root issue here, it’s failing to flush that data frame for some reason and F5 doesn’t send a window update until the window is full.

Using .NET 5.0 I was able to reproduce this with Kestrel using a larger request body (100kb) to exceed kestrel’s 96kb stream window size. I made the server avoid consuming the data and sending window updates, an the client times out.

        static async Task Main(string[] args)
        {
            Console.WriteLine("Hello World!");
            using var client = new HttpClient();
            client.Timeout = TimeSpan.FromSeconds(10);
            client.DefaultRequestVersion = HttpVersion.Version20;
            var response = await client.PostAsync("https://localhost:5001", new ByteArrayContent(new byte[100 * 1024]));
            Console.WriteLine(response);
        }
                endpoints.Map("/", async context =>
                {
                    await Task.Delay(TimeSpan.FromSeconds(15));
                    await context.Response.WriteAsync("Hello World!");
                });
trce: Microsoft.AspNetCore.Server.Kestrel[37]
      Connection id "0HM8HE4K0ED0D" received HEADERS frame for stream ID 1 with length 28 and flags END_HEADERS
trce: Microsoft.AspNetCore.Server.Kestrel[37]
      Connection id "0HM8HE4K0ED0D" received DATA frame for stream ID 1 with length 16384 and flags NONE
trce: Microsoft.AspNetCore.Server.Kestrel[37]
      Connection id "0HM8HE4K0ED0D" received DATA frame for stream ID 1 with length 16384 and flags NONE
trce: Microsoft.AspNetCore.Server.Kestrel[37]
      Connection id "0HM8HE4K0ED0D" received DATA frame for stream ID 1 with length 16384 and flags NONE
trce: Microsoft.AspNetCore.Server.Kestrel[37]
      Connection id "0HM8HE4K0ED0D" received DATA frame for stream ID 1 with length 16384 and flags NONE
trce: Microsoft.AspNetCore.Server.Kestrel[37]
      Connection id "0HM8HE4K0ED0D" received DATA frame for stream ID 1 with length 16384 and flags NONE
info: Microsoft.AspNetCore.Hosting.Diagnostics[1]
      Request starting HTTP/2 POST https://localhost:5001/  102400
info: Microsoft.AspNetCore.Routing.EndpointMiddleware[0]
      Executing endpoint '/'
trce: Microsoft.AspNetCore.Server.Kestrel[37]
      Connection id "0HM8HE4K0ED0D" received DATA frame for stream ID 1 with length 16384 and flags NONE
trce: Microsoft.AspNetCore.Server.Kestrel[37]
      Connection id "0HM8HE4K0ED0D" received RST_STREAM frame for stream ID 1 with length 4 and flags 0x0
trce: Microsoft.AspNetCore.Server.Kestrel[37]

The client sends 5 full frames, 80kb, and then waits for the timeout. Then it sends one more 16kb frame and the RST_STREAM together. 16kb would have put it right at the 96kb limit, but it still should have sent the data. It sounds like it doesn’t flush if it hits the window limit.

I’m going to transfer this over to the HttpClient owners.