runtime: How to debug program terminating with uncaught exception of type PAL_SEHException
I’m testing how my dotnet core application behaves without an internet connection. The program always crashes with
libc++abi.dylib: terminating with uncaught exception of type PAL_SEHException
I tried to find which call triggers it, but manual trial and error did not reveal it.
I know it has something to do with the network connection, since I don’t get this crash when the network is stable. My external API calls themselves behave as expected (they fail with WebException).
How can I debug this?
I’m using dotnet core 3.0.100 and OSX 10.14.5.
About this issue
- State: closed
- Created 5 years ago
- Comments: 18 (7 by maintainers)
I’ve fixed this particular issue in #33448 in the master. I am going to create a porting request for 3.1 and probably even 2.1.
@tobbe303 thank you a lot for providing the call stack. Now it is clear what’s causing the trouble. It is a ThreadAbort exception stemming from FuncEval, which the debugger calls to evaluate managed properties. This exception is expected when a property has code that takes too long to execute and the debugger decides to stop it, as it could be a genuine hang. It needs to be investigated why we don’t handle it properly on Linux. I can quickly see one difference already that seems likely to be the culprit: the FuncEvalHijack assembler helper uses the UnhandledExceptionHandlerUnix personality routine on Unix, while it uses FuncEvalHijackPersonalityRoutine on Windows. We don’t have an implementation of that method for Unix.
cc: @dotnet/dotnet-diag - I will look into implementing the FuncEvalHijackPersonalityRoutine for Unix
I have actually found the real culprit. Here is the problem. Managed code executed as part of the func eval ends up calling the JIT_NewArr1 native function. In our case, when that function is entered, the debugger has already marked the thread for aborting the execution because the func eval was taking too long.
JIT_NewArr1 uses the HELPER_METHOD_FRAME_BEGIN_RET_0 macro to create a HelperMethodFrame explicit frame. But this macro also does a few additional things. Together with the HELPER_METHOD_FRAME_END macro, it creates a try / catch around the body of the method that allows native exceptions to be caught and then processed by DispatchManagedException, as we cannot let them propagate to the managed code. Besides that, it also calls HelperMethodFrame::Push() to push the explicit frame on the per-thread explicit frame list.
That function further calls (tail-calls in release build) HelperMethodFrame::PushSlowHelper(), which checks whether the thread is marked for abort and, if it is, calls (tail-calls in release build) HelperMethodFrame::HandleThreadAbort. That method calls RaiseTheExceptionInternalOnly, which throws PAL_SEHException to initiate the thread abort propagation.
The problem is that this is done before the try / catch region. So the PAL_SEHException is not caught and turned into a call to DispatchManagedException as expected, but gets propagated into the managed caller. And since libunwind doesn’t know anything about the managed code, the exception is considered unhandled and the C++ runtime aborts the process.
It seems the fix might be as easy as moving the try / catch (INSTALL_MANAGED_EXCEPTION_DISPATCHER / UNINSTALL_MANAGED_EXCEPTION_DISPATCHER macros) so that the try region contains the HelperMethodFrame::Push(), but it is possible that there will be some devil hidden in the details.
I suppose I should note that I’m getting this occasionally on MacOS.
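The ordering problem described in the JIT_NewArr1 analysis above can be shown with a small standalone C++ sketch. This is not the CoreCLR code: NativeAbort, PushFrame, HelperBuggy, HelperFixed and CallFromManaged are all hypothetical stand-ins (for PAL_SEHException, HelperMethodFrame::Push(), the helper body before and after the suggested change, and the managed caller respectively). It only illustrates why a throw that happens before the try region is entered escapes the handler that was meant to catch it.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for PAL_SEHException.
struct NativeAbort : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical stand-in for HelperMethodFrame::Push(): it throws when the
// thread has already been marked for abort.
void PushFrame(bool abortRequested) {
    if (abortRequested) {
        throw NativeAbort("thread abort");
    }
}

// Ordering described as the bug: the push runs BEFORE the try region,
// so the handler below never sees the exception.
std::string HelperBuggy(bool abortRequested) {
    PushFrame(abortRequested);
    try {
        return "allocated";
    } catch (const NativeAbort&) {
        return "dispatched";  // stand-in for DispatchManagedException
    }
}

// Ordering suggested as a possible fix: the push runs INSIDE the try region.
std::string HelperFixed(bool abortRequested) {
    try {
        PushFrame(abortRequested);
        return "allocated";
    } catch (const NativeAbort&) {
        return "dispatched";  // stand-in for DispatchManagedException
    }
}

// Stand-in for the managed caller. In the real runtime the unwinder cannot
// walk managed frames, so an escaping exception ends in process abort; here
// we catch it only so that the outcome can be observed.
std::string CallFromManaged(std::string (*helper)(bool), bool abortRequested) {
    try {
        return helper(abortRequested);
    } catch (const NativeAbort&) {
        return "escaped";
    }
}
```

With an abort pending, CallFromManaged(HelperBuggy, true) reports that the exception escaped the helper, while CallFromManaged(HelperFixed, true) reports that it was handed to the dispatch path. Without a pending abort both variants behave the same.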
Actually, there is one more thing: the personality routine is probably not the culprit, since I’ve realized that the exception should not pass through the FuncEvalHijack frame.
The real culprit seems to be that Thread::HandleThreadAbort throws the PAL_SEHException for the ThreadAbort exception. It should be caught in JIT_NewArr1, as that function uses HELPER_METHOD_FRAME_BEGIN_RET_0, and propagated further using the managed exception handling through frame 11 etc. But it seems that for some reason it doesn’t get caught there; the C++ unwinder hits the managed frame and, since it doesn’t know anything about it, considers the exception unhandled and aborts the process.
terminate called after throwing an instance of 'PAL_SEHException'
I’m getting this using VSCode on PopOS 19.10 as well
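For context, both abort messages quoted in this thread ("terminating with uncaught exception of type ..." from libc++abi on macOS and "terminate called after throwing an instance of ..." on Linux) are printed by the C++ runtime's default terminate handler, which runs when an exception finds no handler. The sketch below is a generic, standalone C++ illustration of that mechanism and not something the runtime does: DescribeException, VerboseTerminate and InstallVerboseTerminate are hypothetical names, and it only shows how a native program can replace the terminate handler to log more about the in-flight exception before aborting.

```cpp
#include <cstdio>
#include <cstdlib>
#include <exception>
#include <stdexcept>
#include <string>

// Turn an exception_ptr into a short description by rethrowing it and
// catching by type.
std::string DescribeException(std::exception_ptr ep) {
    if (!ep) {
        return "no active exception";
    }
    try {
        std::rethrow_exception(ep);
    } catch (const std::exception& e) {
        return std::string("std::exception: ") + e.what();
    } catch (...) {
        return "exception not derived from std::exception";
    }
}

// Replacement terminate handler: log what is known about the uncaught
// exception, then abort as the default handler would.
[[noreturn]] void VerboseTerminate() {
    const std::string what = DescribeException(std::current_exception());
    std::fprintf(stderr, "terminate: %s\n", what.c_str());
    std::abort();
}

// Install the handler and return the previous one so it can be restored.
std::terminate_handler InstallVerboseTerminate() {
    return std::set_terminate(VerboseTerminate);
}
```

A type like PAL_SEHException that does not derive from std::exception would fall into the catch-all branch here, which is one reason the default handlers print only the type name.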