runtime: Test failure System.Net.WebSockets.Tests.WebSocketDeflateTests.PayloadShouldHaveSimilarSizeWhenSplitIntoSegments

Run: runtime 20210428.85

Failed test:

net6.0-Linux-Release-arm-CoreCLR_checked-(Alpine.313.Arm32.Open)Ubuntu.1804.ArmArch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:alpine-3.13-helix-arm32v7-20210414141857-1ea6b0a
 -System.Net.WebSockets.Tests.WebSocketDeflateTests.PayloadShouldHaveSimilarSizeWhenSplitIntoSegments(windowBits: 15)

Error message:

System.Threading.Tasks.TaskCanceledException : A task was canceled.


Stack trace
   at System.Net.WebSockets.ManagedWebSocket.SendFrameFallbackAsync(MessageOpcode opcode, Boolean endOfMessage, Boolean disableCompression, ReadOnlyMemory`1 payloadBuffer, CancellationToken cancellationToken) in /_/src/libraries/System.Net.WebSockets/src/System/Net/WebSockets/ManagedWebSocket.cs:line 564
   at System.Net.WebSockets.Tests.WebSocketDeflateTests.PayloadShouldHaveSimilarSizeWhenSplitIntoSegments(Int32 windowBits) in /_/src/libraries/System.Net.WebSockets/tests/WebSocketDeflateTests.cs:line 444
--- End of stack trace from previous location ---

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 40 (40 by maintainers)

Most upvoted comments

I’ve found the difference between what I was running on my machine and what was failing in CI. I was running on the Release runtime, but the test actually ran over its time limit on the Checked runtime. On the Release runtime, the test takes the same 0.3s I was seeing. On the Checked runtime, it takes more than 5s. The pipeline with these results is here. I will build the Checked runtime on my machine to confirm I have the repro and to see what exactly takes the time.

@BruceForstall this was done in #52086

I agree (this is a reminder to us all 🙂) that when a test fails regularly in CI we should disable it immediately if we can’t fix it immediately. We’re good engineers and prefer to investigate and fix rather than disable anything, but that can happen while it’s disabled. 🙂

I’ve run the measurements multiple times on the Checked runtime, and it was indeed Random taking all the time there: the random part takes 4.8s and the deflate part takes 0.3s. When I apply the fix from https://github.com/dotnet/runtime/pull/52052, the random part drops to 0.02s 😲. @danmoseley should we pass the word along to the people who own Random? I will reopen @zlatanov’s PR as the fix is working, and I will double-check that in CI 😊
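To illustrate the kind of measurement described in the comment above, here is a minimal sketch that times the payload-generation phase and the deflate phase separately. It uses Python's `random` and `zlib` as stand-ins for the .NET test code, and `time_phases` and its parameters are hypothetical names chosen for this example. The numbers it produces say nothing about the Checked runtime; the point is only the split of the timing.

```python
import random
import time
import zlib


def time_phases(size: int = 1_000_000, seed: int = 0) -> dict:
    """Time random payload generation and compression as separate phases."""
    rng = random.Random(seed)

    t0 = time.perf_counter()
    # Deliberately generate one byte per call, so the cost of the
    # random-number phase is visible on its own.
    payload = bytes(rng.getrandbits(8) for _ in range(size))
    t1 = time.perf_counter()

    compressed = zlib.compress(payload)
    t2 = time.perf_counter()

    return {
        "random_s": t1 - t0,
        "deflate_s": t2 - t1,
        "compressed_len": len(compressed),
    }


if __name__ == "__main__":
    print(time_phases())
```

Reporting the two durations side by side makes it clear which phase dominates before anyone starts tuning the wrong one.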

I was racking my brain to think what we were missing, and you figured it out. 👏

and I don’t know how to trigger it

You can run it manually from your branch. Just click on “Run Pipeline” here: https://dev.azure.com/dnceng/public/_build?definitionId=686

Then, on the source branch choose your branch… you can push a branch to the dotnet fork of dotnet/runtime and then it will show up as available to run the pipeline from; or if you have a PR you can use refs/pull/<PRid>.

@danmoseley what makes you say it was always failing?

I spoke imprecisely. I meant that when it failed, it was always 14 or 15. Try this query:

https://engsrvprod.kusto.windows.net/engineeringdata

TestResults 
| join kind=inner WorkItems on WorkItemId 
| join kind=inner Jobs on JobId
| where Finished >= datetime(2021-3-1 0:00:00)
and Type == "System.Net.WebSockets.Tests.WebSocketDeflateTests"
and Branch == "refs/heads/main"
| summarize count() by Method, Result, QueueAlias, Arguments, Message

Got it. Interesting. Apologies for crossing questions on the PR as well.

@CarnaViire I created a PR; now we wait to see the run times for the test. If I am correct, the fix should be successful. If, however, you feel that we should remove the test, let me know.

Btw, how could the algorithm change so much that the aggregate of the compressed parts of the message would be much bigger/smaller than the compressed whole message? Wouldn’t it then be a different/new algorithm, not DEFLATE as described in the RFC?

The DEFLATE spec only describes the structure/format of the data. The compression algorithm is an implementation detail that might change (for example zlib-intel vs classic zlib) because of memory constraints or different performance optimizations that might trade compression ratio for speed.
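The point in the reply above can be sketched with Python's `zlib` module, used here as a stand-in for the .NET implementation. The helpers `deflate_whole`, `deflate_segments` and `inflate` are hypothetical names for this example, and the sync flush after every segment is an assumption about how a per-segment send might behave, not a description of what `ManagedWebSocket` does. Both outputs are valid raw DEFLATE and inflate to the same bytes, yet their sizes depend on segmentation and on the compression level.

```python
import zlib


def deflate_whole(payload: bytes, level: int = 6, wbits: int = 15) -> bytes:
    """Compress the payload in one call as a raw DEFLATE stream."""
    # Negative wbits selects raw DEFLATE (no zlib header or trailer).
    c = zlib.compressobj(level, zlib.DEFLATED, -wbits)
    return c.compress(payload) + c.flush(zlib.Z_SYNC_FLUSH)


def deflate_segments(payload: bytes, segment_size: int,
                     level: int = 6, wbits: int = 15) -> bytes:
    """Compress the payload segment by segment with one shared compressor."""
    c = zlib.compressobj(level, zlib.DEFLATED, -wbits)
    out = bytearray()
    for start in range(0, len(payload), segment_size):
        out += c.compress(payload[start:start + segment_size])
        # Flush so that everything written so far can be decoded.
        out += c.flush(zlib.Z_SYNC_FLUSH)
    return bytes(out)


def inflate(data: bytes, wbits: int = 15) -> bytes:
    """Decompress a raw DEFLATE stream produced by the helpers above."""
    return zlib.decompressobj(-wbits).decompress(data)


if __name__ == "__main__":
    payload = b"".join(b"message %d;" % i for i in range(5000))
    whole = deflate_whole(payload)
    parts = deflate_segments(payload, 1024)
    fast = deflate_whole(payload, level=1)
    # Same format and same decoded bytes, but the sizes may differ.
    print(len(payload), len(whole), len(parts), len(fast))
    assert inflate(whole) == inflate(parts) == inflate(fast) == payload
```

Because only the decoded result is fixed by the format, a test that compares compressed sizes has to allow some tolerance rather than expect an exact match.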