runtime: [linux-arm64] Random and rare runtime crash System.ArgumentOutOfRangeException (System.Net.Sockets)

Description

Random and rare crashes with this exception:

Unhandled exception. System.ArgumentOutOfRangeException: Specified argument was out of the range of valid values. (Parameter 'state')
  at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.InvokeContinuation(Action`1 continuation, Object state, Boolean forceAsync, Boolean requiresExecutionContextFlow)
  at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.OnCompleted(SocketAsyncEventArgs _)
  at System.Net.Sockets.SocketAsyncEngine.System.Threading.IThreadPoolWorkItem.Execute()
  at System.Threading.ThreadPoolWorkQueue.Dispatch()
  at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()

It seems to happen only in applications under load.

Exit signal: Abort (6)

Reproduction Steps

We don’t have a reproduction yet. We probably need to heavily stress the network. It seems to be a race condition.

Expected behavior

The runtime should not crash when we are using sockets.

Actual behavior

Random and rare crashes of the runtime.

Regression?

No response

Known Workarounds

No response

Configuration

Dotnet runtime version: 6.0.6
OS: GNU/Linux Debian 11 Bullseye
CPU: ARM64 Graviton 2 (AWS)

We are using Orleans with this application.

Other information

Follow-up to https://github.com/dotnet/runtime/issues/70486. We triple-checked all usages of ValueTask and then removed all of them, just to be sure: this time, this is not a case of a ValueTask being awaited twice.
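For readers auditing their own code for the pattern ruled out above, here is a minimal sketch of what awaiting a ValueTask twice looks like and one way to avoid it. The class and method names are hypothetical and are not taken from the application in this report. The sketch only illustrates the misuse pattern and makes no claim about the cause of this crash.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper names, for illustration only.
public static class ValueTaskUsage
{
    // Misuse: the same ValueTask instance is awaited twice.
    // With a ValueTask that wraps a plain value this happens to work,
    // but when it is backed by a pooled IValueTaskSource (as socket
    // operations can be) the second await is unsupported, because the
    // underlying source may already have been reset and reused.
    public static async Task<int> AwaitTwiceUnsafe(ValueTask<int> pending)
    {
        int first = await pending;
        int second = await pending; // second consumption of the same instance
        return first + second;
    }

    // Safer variant: convert to Task exactly once, then await the Task
    // as many times as needed.
    public static async Task<int> AwaitTwiceSafe(ValueTask<int> pending)
    {
        Task<int> task = pending.AsTask();
        int first = await task;
        int second = await task;
        return first + second;
    }
}
```

Calling `ValueTaskUsage.AwaitTwiceSafe(socket.ReceiveAsync(buffer, SocketFlags.None))` would be one way to apply this to a socket read, assuming `socket` and `buffer` exist in the calling code.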

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 19 (13 by maintainers)

Most upvoted comments

I’m also getting the same error. It looks like it’s also appearing here: https://github.com/aws/aws-lambda-dotnet/issues/1244.

My config:
Dotnet runtime version: v7.0.100-preview.7 (ARM64)
OS: macOS 12.5 (Monterey)
CPU: ARM64 Apple Silicon M1 Max

This happens (again, intermittently) when I’m running/debugging a few microservices (on Kestrel - was unsure if this was a Kestrel issue, but saw this reported here).

One additional piece of info is that in the framework method that throws the exception:

throw new ArgumentOutOfRangeException(GetArgumentName(argument));

The argument parameter is always (int) 40.

Just like the linked AWS issue, none of my exception handlers seem to be catching the error.

Any ideas on next steps? I can’t seem to isolate the exception for a repro…
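The stack trace in this issue shows the exception being thrown on a thread-pool thread inside runtime code, so it never passes through a try/catch in application code. A last-chance handler might still capture some context before the process exits. The following is a minimal sketch under that assumption: the class and method names are hypothetical, and the handler only logs, it cannot stop the process from terminating.

```csharp
using System;

// Hypothetical helper names, for illustration only.
public static class CrashLogging
{
    // Formats the event data so it can be written to a log.
    public static string Describe(UnhandledExceptionEventArgs e) =>
        $"terminating={e.IsTerminating} exception={e.ExceptionObject}";

    // Registers a process-wide handler for unhandled exceptions.
    // This does not prevent the crash; it only records what was thrown.
    public static void Install()
    {
        AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
            Console.Error.WriteLine(Describe(e));
    }
}
```

Calling `CrashLogging.Install()` once at the start of `Main` would register the handler for the lifetime of the process.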

https://github.com/orgs/dotnet/teams/dnceng any chance the limit can be increased?

Unfortunately this limitation comes about from our support of on-premises machines; these tend to cost us lots of time and money uploading large dumps which are often ignored despite this time/financial cost.

If you need to check out a machine with the same specifications as the test one, that can likely be arranged, we’d just need to know the specific queue that this work item ran on (or have its full log linked, etc)