runtime: Hang while waiting for runtime to suspend thread in busy loop
Ran perf tests overnight on Windows 7, WOW64, 7.0 preview 2 at 1615de6c and found one busy hung in the morning.
One thread is trying to allocate an object, and GC has decided to run (fwiw the heap is not large)
0:005> kp
# ChildEBP RetAddr
00 0829f448 75221629 ntdll!ZwWaitForSingleObject(void)+0x15 [d:\win7.obj.x86fre\minkernel\ntdll\wow6432\objfre\i386\usrstubs.asm @ 135]
01 0829f448 75111194 KERNELBASE!WaitForSingleObjectEx(void * hHandle = 0x00000088, unsigned long dwMilliseconds = 1, int bAlertable = 0n0)+0x98 [d:\win7\minkernel\kernelbase\synch.c @ 1301]
02 0829f460 73346fc4 kernel32!WaitForSingleObjectExImplementation(void * hHandle = 0x00000088, unsigned long dwMilliseconds = 1, int bAlertable = 0n0)+0x75 [d:\win7\base\win32\client\synch.c @ 345]
03 (Inline) -------- coreclr!CLREventWaitHelper2(void)+0x9 [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 372]
04 0829f4b4 73346f7b coreclr!CLREventWaitHelper(void * handle = <Value unavailable error>, unsigned long dwMilliseconds = <Value unavailable error>, int alertable = 0n0)+0x2d [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 397]
05 0829f4c4 733469c0 coreclr!CLREventBase::WaitEx(unsigned long dwMilliseconds = 1, WaitMode mode = WaitMode_None (0n0), struct PendingSync * syncState = 0x00000000)+0x32 [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 469]
06 0829f4dc 73335628 coreclr!CLREventBase::Wait(unsigned long dwMilliseconds = 1, int alertable = 0n0, struct PendingSync * syncState = 0x00000000)+0x1a [D:\a\_work\1\s\src\coreclr\vm\synch.cpp @ 413]
07 0829f560 733705ac coreclr!ThreadSuspend::SuspendRuntime(ThreadSuspend::SUSPEND_REASON reason = <Value unavailable error>)+0x54a [D:\a\_work\1\s\src\coreclr\vm\threadsuspend.cpp @ 3563]
08 0829f60c 73370452 coreclr!ThreadSuspend::SuspendEE(ThreadSuspend::SUSPEND_REASON reason = SUSPEND_FOR_GC (0n1))+0xf9 [D:\a\_work\1\s\src\coreclr\vm\threadsuspend.cpp @ 5714]
09 0829f634 733708cc coreclr!GCToEEInterface::SuspendEE(SUSPEND_REASON reason = SUSPEND_FOR_GC (0n1))+0x26 [D:\a\_work\1\s\src\coreclr\vm\gcenv.ee.cpp @ 28]
0a 0829f634 73370415 coreclr!WKS::GCHeap::GarbageCollectGeneration(unsigned int gen = 0, gc_reason reason = reason_alloc_soh (0n0))+0xa8 [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 45101]
0b 0829f658 732ef1b3 coreclr!WKS::gc_heap::trigger_gc_for_alloc(int gen_number = <Value unavailable error>, gc_reason gr = <Value unavailable error>, struct WKS::GCDebugSpinLock * msl = 0x73681a18, bool loh_p = false)+0x19 [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 17075]
0c 0829f68c 733c3bb4 coreclr!WKS::gc_heap::try_allocate_more_space(struct alloc_context * acontext = 0x0255f270, unsigned int size = 0x10, unsigned int flags = 2, int gen_number = 0n0)+0x1e4 [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 17221]
0d 0829f6a4 732ee473 coreclr!WKS::gc_heap::allocate_more_space(struct alloc_context * acontext = 0x0255f270, unsigned int size = 0x10, unsigned int flags = 2, int alloc_generation_number = 0n0)+0x18 [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 17685]
0e (Inline) -------- coreclr!WKS::gc_heap::allocate(void)+0x40 [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 17716]
0f 0829f6c4 732d989b coreclr!WKS::GCHeap::Alloc(struct gc_alloc_context * context = 0x0255f270, unsigned int size = 0x10, unsigned int flags = 2)+0x53 [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 44097]
10 (Inline) -------- coreclr!Alloc(GC_ALLOC_FLAGS flags = <Value unavailable error>)+0x108 [D:\a\_work\1\s\src\coreclr\vm\gchelpers.cpp @ 226]
11 0829f724 732ee14a coreclr!AllocateObject(class MethodTable * pMT = 0x0079e774, bool fHandleCom = <Value unavailable error>)+0x175 [D:\a\_work\1\s\src\coreclr\vm\gchelpers.cpp @ 979]
12 0829f7a8 007abb6b coreclr!JIT_New(struct CORINFO_CLASS_STRUCT_ * typeHnd_ = 0x0079e774)+0xaa [D:\a\_work\1\s\src\coreclr\vm\jithelpers.cpp @ 2312]
13 0829f7c0 718474f5 MicroBenchmarks!System.Collections.Concurrent.AddRemoveFromDifferentThreads`1+<>c__DisplayClass8_0[[System.Int32, System.Private.CoreLib]].<SetupConcurrentStackIteration>b__0(<Win32 error 0n318>)+0x4b
It seems to be waiting for this thread to suspend
0:006> kp
# ChildEBP RetAddr
00 05f5f880 718474f5 MicroBenchmarks!System.Collections.Concurrent.AddRemoveFromDifferentThreads`1+<>c__DisplayClass8_0[[System.Int32, System.Private.CoreLib]].<SetupConcurrentStackIteration>b__1(<Win32 error 0n318>)+0x43
01 05f5f890 7184b61d System_Private_CoreLib!System.Threading.Tasks.Task.InnerInvoke(System.Threading.Tasks.Task this = <Value unavailable error>)+0x35 [/_/src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/Task.cs @ 2411]
and that thread is in a busy loop trying to pop from a ConcurrentStack. the stack is empty, so I believe TryPop is returning false immediately. itโs partially inlined
while (count < Size)
{
if (stack.TryPop(out T _))
{
count++;
}
}
Should we have emitted interruptible code here? Or is it by design that if you write such code you may hang at an unpredicatable moment?
Dumps shared offline.
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 20 (20 by maintainers)
That would be a fun thing to document ๐
No. In pure managed code suspension latency (stopping the runtime at safe points) should be in 0.1-1 msec range regardless of what user code is doing. There could be rare outliers going into 10-50 msec range, but anything above that would be worth investigating.