runtime: ExecutionEngineException when using step-by-step

Description

When trying to find the source of an “CancelledOperationException” being thrown in background in my app, I stumbled on a weird bug.
When I step in a specific place, it will throw a ExecutionEngineException, and if I try to continue the program execution it will freeze VS2022 (and I must kill it with taskmanager).
As I don’t know where the issue come from, I created the issue here.

Reproduction Steps

Clone my repo https://github.com/Kuinox/CK-LogViewer/commit/ac8a94534f510e83c52401cb267a46bd873c94e4 and checkout on the commit linked (it was the develop branch when this issue was opened).

Run LogSampleGenerator until it hit the Debugger.Break().
Now due to this issue: https://github.com/dotnet/sdk/issues/1458, 3 pdb must be loaded by hand in VS, they are located in the nuget cache:
%userprofile%\.nuget\packages\ck.mqtt.abstractions\0.10.0\lib\net6.0 %userprofile%\.nuget\packages\ck.mqtt.common\0.10.0\lib\net6.0 %userprofile%\.nuget\packages\ck.mqtt.client\0.10.0\lib\net6.0 Now navigate to

CK.MQTT.Client.dll!CK.MQTT.SmallOutgoingApplicationMessageExtensions.PublishAsync(CK.MQTT.IMqtt3Client client, CK.Core.IActivityMonitor m, string topic, CK.MQTT.QualityOfService qos, bool retain, System.ReadOnlyMemory<byte> payload) Line 13	C#
CK.Monitoring.MQTT.dll!CK.Monitoring.MQTT.MQTT.DoHandleAsync(CK.Core.IActivityMonitor m, CK.Monitoring.IMulticastLogEntry entry) Line 29	C#

image Now trying to step in SendPacketAsync throw the ExecutionEngineException.
Pressing F5/F10 here lead to VS freezing.

Expected behavior

Being able to debug my program.

Actual behavior

An ExecutionEngineException is throwed.

Regression?

I don’t know, new app/code.

Known Workarounds

No response

Configuration

.NET 6.0.1 Windows 11/10

Other information

No response

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 21 (21 by maintainers)

Most upvoted comments

Here is a simplified repro case:

  1. Create a new console app project with this code in Visual Studio targetting .NET 6
IClient client = new Client();
client.SendPacketAsync<object>(1, 2, 3, 4);
client.PublishAsync();

static class Extensions
{
    public static void PublishAsync( this IClient client)
    {
        client.SendPacketAsync<object>(1,2,3,4);       // <----- set breakpoint here
    }
}

interface IClient
{
    public void SendPacketAsync<T>(int arg1, int arg2, int arg3, int arg4);
}

class Client : IClient
{
    public void SendPacketAsync<T>(int arg1, int arg2, int arg3, int arg4)
    {
        Console.WriteLine(typeof(T).Name);
    }
}
  1. Set breakpoint at the commented line
  2. Turn off JMC
  3. Run the program under the debugger (F5)
  4. When the code reaches the breakpoint, step in

Expected result: execution pauses inside SendPacketAsync() Actual result: VS shows a dialog for an unhandled ExecutionEngineException

@davidwrighton @AaronRobinsonMSFT @jkoritzinsky - this bug appears to be a long-standing issue with IL stubs, I’m not sure who the owner of that is these days? The short version is that ILStubManager::TraceManager doesn’t appear to correctly handle all the ILStubs that the runtime creates. A few years ago we encountered a very similar problem with a different type of ILStub that this method didn’t handle so this appears to be a repeating problem.

Investigating the dump that @Kuinox sent us, this is the long version of what I think likely has happened:

  1. The debugger is initially stopped in the PublishAsync() method shown above and then we do a step-in operation.
  2. In order to do the step operation the debugger repeatedly predicts forward in the disassembly to determine where execution will go, sets a breakpoint, and then runs to that point. If the scenario had worked properly the debugger would eventually determine that it had executed far enough and break out of the loop, but we don’t get that far in this case.
  3. While still in that loop the debugger executes to IP 00007ffae9037c31 and tries to determine where execution will go next.
// this is part of the disassembly for the source code line "return await client.SendPacketAsync<object>(...)
0:035> u 00007ffa`e9037c11
CK_MQTT_Client!CK.MQTT.SmallOutgoingApplicationMessageExtensions+<PublishAsync>d__0.MoveNext()+0x141:
00007ffa`e9037c11 e86a4bee5e      call    coreclr!JIT_VirtualFunctionPointer (00007ffb`47f1c780)
00007ffa`e9037c16 48894580        mov     qword ptr [rbp-80h],rax
00007ffa`e9037c1a 488d9568ffffff  lea     rdx,[rbp-98h]
00007ffa`e9037c21 488b4d90        mov     rcx,qword ptr [rbp-70h]
00007ffa`e9037c25 4c8b4588        mov     r8,qword ptr [rbp-78h]
00007ffa`e9037c29 4c8b4d98        mov     r9,qword ptr [rbp-68h]
00007ffa`e9037c2d 488b4580        mov     rax,qword ptr [rbp-80h]
00007ffa`e9037c31 ffd0            call    rax                  // <--------------- execution reached here

That call at the end will lead to executing the jitted code for the IL instantiating stub for SendPacketAsync<T>(…)

  1. The debugger observes the call opcode and knows the current value of rax, so it invokes TrapStepInHelper to try figuring out where execution will go to and set a breakpoint. This calls TrapStepInto, which calls into StubManager::TraceStub(). The IP in rax points to 00007ffae9038da0, an IL instantiating stub:
0:035> !ip2md 00007ffa`e9038da0
MethodDesc:   00007ffae91229e8
Method Name:          ILStubClass.IL_STUB_InstantiatingStub(CK.Core.IActivityMonitor, CK.MQTT.IOutgoingPacket)
Class:                00007ffae8fc03a8
MethodTable:          00007ffae8fadca8
mdToken:              0000000006000000
Module:               00007ffae8fa8180
IsJitted:             yes
Current CodeAddr:     00007ffae9038da0
Version History:
  ILCodeVersion:      0000000000000000
  ReJIT ID:           0
  IL Addr:            0000000000000000
     CodeAddr:           00007ffae9038da0  (MinOptJitted)
     NativeCodeVersion:  0000000000000000
  1. StubManager::TraceStub() determines that the target address belongs to the ILStubManager because ILStubManager::CheckIsStub_Internal claims all addresses that map to MethodDescs where MethodDesc::IsILStub() is TRUE.

  2. StubManager::TraceStub then asks ILStubManager to predict where the stub will execute to by calling ILStubManager::DoTraceStub(). The implementation of this method returns something called “ManagerPush” with the stub start address (00007ffae9038da0). This is intended for cases where the stub manager can’t predict the destination of the stub given its IP alone. Instead the StubManager is requesting that the debugger set a breakpoint at the address it returned, then call back to the stub manager later once execution has reached that address. At that point the debugger will be able to provide a complete CONTEXT rather than the IP alone.

  3. StubManager code returns back to the debugger, the debugger sets the breakpoint at 00007ffae9038da0 as requested, and returns back through the vectored exception handler which ultimately resumes execution in the jitted user code at 00007ffae9037c31:

00007ffa`e9037c31 ffd0            call    rax
  1. The processor executes the call instruction, immediately arriving at 00007ffae9038da0 and triggering the breakpoint placed by the debugger. This generates an exception that calls back into vectored exception handler and works it way back to the debugger code.
0:035> u
CK_MQTT_Client!ILStubClass.IL_STUB_InstantiatingStub(CK.Core.IActivityMonitor, CK.MQTT.IOutgoingPacket):
00007ffa`e9038da0 cc              int     3     // <--------------- execution has now advanced to here
00007ffa`e9038da1 4883ec30        sub     rsp,30h
00007ffa`e9038da5 488d6c2430      lea     rbp,[rsp+30h]
00007ffa`e9038daa 48894d10        mov     qword ptr [rbp+10h],rcx
00007ffa`e9038dae 48895518        mov     qword ptr [rbp+18h],rdx
00007ffa`e9038db2 4c894520        mov     qword ptr [rbp+20h],r8
00007ffa`e9038db6 4c894d28        mov     qword ptr [rbp+28h],r9
00007ffa`e9038dba 48b9787603e9fa7f0000 mov rcx,offset CLRStub[MethodDescPrestub]@7ffae9037678 (00007ffa`e9037678)
  1. The debugger recognizes that it placed this breakpoint at the request of the ILStubManager, and now that it has reached the requested address it invokes back to the ILStubManager again wanting a prediction on where execution will go. This time the debugger provides a complete CONTEXT:
0:035> k
...
22 0000003d`c1b7ba90 00007ffb`480e7bb1     coreclr!ILStubManager::TraceManager+0x11d [D:\a\_work\1\s\src\coreclr\vm\stubmgr.cpp @ 1797] 
23 0000003d`c1b7bac0 00007ffb`482295b0     coreclr!EEDbgInterfaceImpl::TraceManager+0x91 [D:\a\_work\1\s\src\coreclr\vm\eedbginterfaceimpl.cpp @ 1261] 
24 0000003d`c1b7bb70 00007ffb`48227216     coreclr!DebuggerStepper::TriggerPatch+0x160 [D:\a\_work\1\s\src\coreclr\debug\ee\controller.cpp @ 7135] 
25 0000003d`c1b7d9e0 00007ffb`48225381     coreclr!DebuggerController::ScanForTriggers+0x1ba [D:\a\_work\1\s\src\coreclr\debug\ee\controller.cpp @ 2662] 
26 0000003d`c1b7da60 00007ffb`480cea65     coreclr!DebuggerController::DispatchPatchOrSingleStep+0xd1 [D:\a\_work\1\s\src\coreclr\debug\ee\controller.cpp @ 2957] 
27 0000003d`c1b7db40 00007ffb`4803831e     coreclr!DebuggerController::DispatchNativeException+0x96739 [D:\a\_work\1\s\src\coreclr\debug\ee\controller.cpp @ 4299] 
28 0000003d`c1b7db90 00007ffb`48017998     coreclr!Debugger::FirstChanceNativeException+0x5e [D:\a\_work\1\s\src\coreclr\debug\ee\debugger.cpp @ 5574] 
29 0000003d`c1b7dbe0 00007ffb`4801788d     coreclr!IsDebuggerFault+0x6c [D:\a\_work\1\s\src\coreclr\vm\excep.cpp @ 6592] 
2a 0000003d`c1b7dc20 00007ffb`480177d6     coreclr!CLRVectoredExceptionHandlerPhase2+0x91 [D:\a\_work\1\s\src\coreclr\vm\excep.cpp @ 7361] 
2b 0000003d`c1b7dc80 00007ffb`480176da     coreclr!CLRVectoredExceptionHandler+0xc2 [D:\a\_work\1\s\src\coreclr\vm\excep.cpp @ 7277] 
2c 0000003d`c1b7dd40 00007ffc`2247bfea     coreclr!CLRVectoredExceptionHandlerShim+0xea [D:\a\_work\1\s\src\coreclr\vm\excep.cpp @ 8016] 
2d 0000003d`c1b7dd90 00007ffc`22435812     ntdll!RtlpCallVectoredHandlers+0x112 [minkernel\ntdll\vectxcpt.c @ 204] 
2e (Inline Function) --------`--------     ntdll!RtlCallVectoredExceptionHandlers+0xe [minkernel\ntdll\vectxcpt.c @ 358] 
2f 0000003d`c1b7de30 00007ffc`224a730e     ntdll!RtlDispatchException+0x62 [minkernel\ntos\rtl\amd64\exdsptch.c @ 390] 
30 0000003d`c1b7e080 00007ffa`e9038da0     ntdll!KiUserExceptionDispatch+0x2e [minkernel\ntos\rtl\amd64\trampoln.asm @ 751] 
31 0000003d`c1b7e7b8 00007ffa`e9037c33     CK_MQTT_Client!ILStubClass.IL_STUB_InstantiatingStub(CK.Core.IActivityMonitor, CK.MQTT.IOutgoingPacket)

// switching to frame 0x23 and showing the CONTEXT provided to the ILStubManager
0:035> dx Debugger.Sessions[0].Processes[28316].Threads[23536].Stack.Frames[35].SwitchTo();dv /t /v
0:035> ?? context
struct _CONTEXT * 0x0000003d`c1b7e080
   +0x000 P1Home           : 0x4545470e
   +0x008 P2Home           : 0x00007ffc`22435249
   +0x010 P3Home           : 0x000001aa`d4e18dc0
   +0x018 P4Home           : 0x0000003d`c1b80000
   +0x020 P5Home           : 0x0000003d`c1b7ea28
   +0x028 P6Home           : 0x0000003d`c1b7ead8
   +0x030 ContextFlags     : 0x10005f
   +0x034 MxCsr            : 0x1f80
   +0x038 SegCs            : 0x33
   +0x03a SegDs            : 0x2b
   +0x03c SegEs            : 0x2b
   +0x03e SegFs            : 0x53
   +0x040 SegGs            : 0x2b
   +0x042 SegSs            : 0x2b
   +0x044 EFlags           : 0x202
   +0x048 Dr0              : 0
   +0x050 Dr1              : 0
   +0x058 Dr2              : 0
   +0x060 Dr3              : 0
   +0x068 Dr6              : 0
   +0x070 Dr7              : 0
   +0x078 Rax              : 0x00007ffa`e9037688
   +0x080 Rcx              : 0x000001aa`d4e19b10
   +0x088 Rdx              : 0x0000003d`c1b7e858
   +0x090 Rbx              : 0x000001aa`d4e18dc0
   +0x098 Rsp              : 0x0000003d`c1b7e7b8
   +0x0a0 Rbp              : 0x0000003d`c1b7e8f0
   +0x0a8 Rsi              : 0x0000003d`c1b7ead8
   +0x0b0 Rdi              : 0x0000003d`c1b7ea28
   +0x0b8 R8               : 0x000001aa`d4e183e8
   +0x0c0 R9               : 0x000001ad`d4bc72e8
   +0x0c8 R10              : 0x000001b0`e462f170
   +0x0d0 R11              : 0x4f977cc5
   +0x0d8 R12              : 0x000001aa`d4e182e8
   +0x0e0 R13              : 0x00007ffa`e8474928
   +0x0e8 R14              : 1
   +0x0f0 R15              : 0x4545470e
   +0x0f8 Rip              : 0x00007ffa`e9038da0
   +0x100 FltSave          : _XSAVE_FORMAT
   +0x100 Header           : [2] _M128A
   +0x120 Legacy           : [8] _M128A
   +0x1a0 Xmm0             : _M128A
   +0x1b0 Xmm1             : _M128A
   +0x1c0 Xmm2             : _M128A
   +0x1d0 Xmm3             : _M128A
   +0x1e0 Xmm4             : _M128A
   +0x1f0 Xmm5             : _M128A
   +0x200 Xmm6             : _M128A
   +0x210 Xmm7             : _M128A
   +0x220 Xmm8             : _M128A
   +0x230 Xmm9             : _M128A
   +0x240 Xmm10            : _M128A
   +0x250 Xmm11            : _M128A
   +0x260 Xmm12            : _M128A
   +0x270 Xmm13            : _M128A
   +0x280 Xmm14            : _M128A
   +0x290 Xmm15            : _M128A
   +0x300 VectorRegister   : [26] _M128A
   +0x4a0 VectorControl    : 0x00000160`000000f0
   +0x4a8 DebugControl     : 0
   +0x4b0 LastBranchToRip  : 0
   +0x4b8 LastBranchFromRip : 0
   +0x4c0 LastExceptionToRip : 0
   +0x4c8 LastExceptionFromRip : 0
  1. ILStubManager::TraceManager incorrectly classifies the stub as being either a p/invoke or a CLR-to-COM call because it is the default case of a large if/else if chain:
if(pStubMD->IsMulticastStub()) { ... }
else if(pStubMD->IsReverseStub()) { ... }
else if(pStubMD->IsDelegateStub()) { ... }
else if(pStubMD->IsStubCALLIStub()) { ... }
else if(pStubMD->IsStructMarshalStub() { ... }
else 
{
   // This case is only intended to handle p/invoke and CLR-to-COM call, however no earlier condition matched
   // IL instantiated stub so execution enters here.
}

I’m pretty sure the bug is right here in the incorrect classification, but execution continues a little further doing calculations based on garbage data…

  1. ILStubManager::TraceManager calculates arg, the hidden argument, by reading from R10 in the CONTEXT above: 0x000001b0e462f170. pMD = arg pCMD = pMD pCMD->m_pComPlusCallInfo = 00007ffa4f977cc5 As best I understand R10 is not part of the expected calling convention for invoking an instantiating stub so these computations are based on whatever arbitrary value happened to be in the register at the time.
0:035> ?? (ComPlusCallMethodDesc*)0x000001b0e462f170
class ComPlusCallMethodDesc * 0x000001b0`e462f170
   +0x000 m_wFlags3AndTokenRemainder : 0
   +0x002 m_chunkIndex     : 0 ''
   +0x003 m_bFlags2        : 0 ''
   +0x004 m_wSlotNumber    : 0
   +0x006 m_wFlags         : 0
   =00007ffb`482cb6c0 s_ClassificationSizeTable : [0]  "???"
   +0x008 m_pComPlusCallInfo : 0x00007ffa`4f977cc5 ComPlusCallInfo
  1. Inside ILStubManager::GetCOMTarget the runtime evaluates the expression pComPlusCallInfo->m_pInterfaceMT which is what ultimately triggers the AV that surfaces back to the developer as the ExecutionEngineException
0:035> ?? (ComPlusCallInfo*)0x00007ffa4f977cc5
struct ComPlusCallInfo * 0x00007ffa`4f977cc5
   +0x000 m_pILStub        : ??
   +0x000 m_pEventProviderMD : ???? 
   +0x008 m_pInterfaceMT   : ???? 
   +0x010 m_flags          : ??
   +0x012 m_cachedComSlot  : ??
   +0x018 m_pStubMD        : PlainPointer<MethodDesc *>

The AV occurs at 00007ffb48118e04 which is dereferencing r14+8 = 0x00007ffa4f977cc5+8 = 0x00007ffa4f977ccd

0:035> u
coreclr!ILStubManager::GetCOMTarget+0x74 [D:\a\_work\1\s\src\coreclr\vm\stubmgr.cpp @ 1654]:
00007ffb`48118e04 498b5608        mov     rdx,qword ptr [r14+8]
00007ffb`48118e08 488d4d20        lea     rcx,[rbp+20h]
00007ffb`48118e0c e8c7380c00      call    coreclr!ComObject::GetComIPFromRCWThrowing (00007ffb`481dc6d8)
00007ffb`48118e11 488bf8          mov     rdi,rax
00007ffb`48118e14 4885c0          test    rax,rax
00007ffb`48118e17 750a            jne     coreclr!ILStubManager::GetCOMTarget+0x93 (00007ffb`48118e23)
00007ffb`48118e19 33db            xor     ebx,ebx
00007ffb`48118e1b 48895dc0        mov     qword ptr [rbp-40h],rbx

0:035> dx Debugger.Sessions[0].Processes[28316].Threads[23536].Stack.Frames[24].SwitchTo()
0:035> ?? pExceptionInfo->ExceptionRecord->ExceptionInformation[1]
unsigned int64 0x00007ffa`4f977ccd

Thanks @danmoseley! @Kuinox I’ve downloaded the dump so it’s safe to delete the uploaded files.

Ok, I edited the ticket to remove the link to the uploaded dump.