runtime: How to debug program terminating with uncaught exception of type PAL_SEHException

I’m testing how my .NET Core application behaves with no internet connection. The program always crashes with

libc++abi.dylib: terminating with uncaught exception of type PAL_SEHException

I tried to find what command is triggering it but could not find it by manual trial and error.

I know it has something to do with the network connection, since I don’t get this crash when the network is stable. My external API calls themselves behave as expected (they fail with WebException).

How can I debug this?

I’m using dotnet core 3.0.100 and OSX 10.14.5

About this issue

State: closed
Created 5 years ago
Comments: 18 (7 by maintainers)

Most upvoted comments

I’ve fixed this particular issue in #33448 in the master. I am going to create a porting request for 3.1 and probably even 2.1.

+10

janvorli on Mar 13, 2020

@tobbe303 thank you a lot for providing the call stack. Now it is clear what’s causing the trouble. It is a ThreadAbort exception stemming from the FuncEval that the debugger calls to evaluate managed properties. This exception is expected when a property contains code that takes too long to execute and the debugger decides to stop it, as it could be a genuine hang. We need to investigate why we don’t handle it properly on Linux. I can already see one difference that is likely the culprit: the FuncEvalHijack assembler helper uses the UnhandledExceptionHandlerUnix personality routine on Unix, while it uses FuncEvalHijackPersonalityRoutine on Windows. We don’t have an implementation of that method for Unix.

cc: @dotnet/dotnet-diag - I will look into implementing the FuncEvalHijackPersonalityRoutine for Unix

janvorli on Mar 5, 2020

I have actually found the real culprit. Here is the problem. Managed code executed as part of the func eval ends up calling the JIT_NewArr1 native function. In our case, when that function is entered, the debugger has already marked the thread for abort because the func eval was taking too long. JIT_NewArr1 uses the HELPER_METHOD_FRAME_BEGIN_RET_0 macro to create a HelperMethodFrame explicit frame. But this macro also does a few additional things. Together with the HELPER_METHOD_FRAME_END macro, it creates a try / catch around the body of the method that allows native exceptions to be caught and then processed by DispatchManagedException, as we cannot let them propagate to the managed code. Besides that, it also calls HelperMethodFrame::Push() to push the explicit frame onto the per-thread explicit frame list. That function further calls (tail-calls in release builds) HelperMethodFrame::PushSlowHelper(), which checks whether the thread is marked for abort and, if it is, calls (tail-calls in release builds) HelperMethodFrame::HandleThreadAbort. That method calls RaiseTheExceptionInternalOnly, which throws PAL_SEHException to initiate the thread abort propagation. The problem is that this is done before the try / catch region. So the PAL_SEHException is not caught and turned into a call to DispatchManagedException as expected, but instead propagates into the managed caller. And since libunwind doesn’t know anything about the managed code, the exception is considered unhandled and the C++ runtime aborts the process.

It seems the fix might be as easy as moving the try / catch (INSTALL_MANAGED_EXCEPTION_DISPATCHER / UNINSTALL_MANAGED_EXCEPTION_DISPATCHER macros) so that the try region contains the HelperMethodFrame::Push(), but it is possible that there will be some devil hidden in the details.

janvorli on Mar 10, 2020

I suppose I should note that I’m getting this occasionally on MacOS.

eltiare on Mar 6, 2020

Actually, there is one more thing: the personality routine is probably not the culprit, since I’ve realized that the exception should not pass through the FuncEvalHijack frame.

The real culprit seems to be that Thread::HandleThreadAbort throws the PAL_SEHException for the ThreadAbort exception. It should be caught in JIT_NewArr1, which uses HELPER_METHOD_FRAME_BEGIN_RET_0, and then propagated further via managed exception handling through frame 11 etc. But for some reason it doesn’t get caught there; the C++ unwinder hits the managed frame and, since it doesn’t know anything about it, considers the exception unhandled and aborts the process.

janvorli on Mar 5, 2020

terminate called after throwing an instance of 'PAL_SEHException'

I’m getting this using VSCode on PopOS 19.10 as well

Parasrah on Feb 26, 2020