runtime: Segmentation fault on arm32 (raspberry-pi3)
From @SteveL-MSFT on August 29, 2017 22:25
After building PowerShell with runtime linux-arm, it runs until it hits a second ManualResetEvent::WaitOne() call and then segfaults. Stack trace from gdb:
Thread 23 "powershell" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x694e1450 (LWP 11108)]
0x76692ecc in VirtualCallStubManager::predictStubKind(unsigned int) () from /home/pi/powershell/libcoreclr.so
(gdb) backtrace
#0 0x76692ecc in VirtualCallStubManager::predictStubKind(unsigned int) () from /home/pi/powershell/libcoreclr.so
#1 0x766981d6 in VirtualCallStubManager::getStubKind(unsigned int) () from /home/pi/powershell/libcoreclr.so
#2 0x766951b4 in VirtualCallStubManager::FindStubManager(unsigned int, VirtualCallStubManager::StubKind*) ()
   from /home/pi/powershell/libcoreclr.so
#3 0x7669698e in VSD_ResolveWorker () from /home/pi/powershell/libcoreclr.so
#4 0x7673cb30 in ResolveWorkerAsmStub () from /home/pi/powershell/libcoreclr.so
#5 0x687ca346 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)
Copied from original issue: dotnet/corefx#23660
About this issue
- State: closed
- Created 7 years ago
- Comments: 18 (15 by maintainers)
Fixed by dotnet/coreclr#13922
I’ve debugged the issue and it is a codegen issue. ResolveWorkerAsmStub expects to get the indirection cell address combined with two flag bits in register R4, but it gets the address of an argument shuffling thunk instead. The managed frame (frame #5 in the stack trace in the issue description above) is a frame of the following function:
This function calls an argument shuffling thunk via the blx r4. The thunk’s code is below:
This thunk replaces the LR pushed by the first push by the value taken from [R0+16] and so the pop at the end jumps to the following piece of code:
The values at the pc and pc + 8 are as follows:
So this piece of code jumps to 0xb66f2ced, which is the ResolveWorkerAsmStub asm helper. And now we are coming to the culprit. As I’ve already said, this asm helper expects R4 to contain the indirection cell address. But as you can see, the argument shuffling thunk didn’t touch R4 and so we get the R4 that came from the DomainNeutralILStubClass.IL_STUB_SecureDelegate_Invoke. And as you can see, R4 was used to jump to the argument shuffling thunk so it contains its address.
So I believe this is a JIT codegen bug. If you look at the generated code of the DomainNeutralILStubClass.IL_STUB_SecureDelegate_Invoke, you can see that at 0xa87e9a38, the indirection cell address was loaded to R4, but right in the next instruction, it was overwritten by the address that the blx called a bit later.
Here is the problem: https://github.com/dotnet/coreclr/blob/3297fd43b6d78c025e3befa3b6242229deaa9094/src/jit/codegenlegacy.cpp#L18667
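For readers who want the register flow from the analysis above in a compact form, here is a toy Python model of it. This is only a sketch and not CoreCLR code: the addresses, the placement of the two flag bits in the low bits of R4, the function names, and the use of r12 in the "fixed" variant are all assumptions made for illustration. In particular, the "fixed" variant shows one possible way to avoid the clobber, not necessarily what the actual fix does.

```python
# Toy model of the register flow described above.
# All names, addresses and the flag layout are hypothetical.

FLAG_MASK = 0b11                    # assumption: two flag bits in the low bits of R4
INDIRECTION_CELL = 0x10000040       # made-up address of a valid indirection cell
THUNK_ADDR = 0x20000100             # made-up address of the shuffling thunk


def resolve_worker(regs):
    """Stand-in for ResolveWorkerAsmStub: decodes R4 as cell address + flags."""
    cell = regs["r4"] & ~FLAG_MASK
    flags = regs["r4"] & FLAG_MASK
    if cell != INDIRECTION_CELL:
        # The real runtime has no such check; it crashes later instead.
        raise RuntimeError(f"r4={regs['r4']:#x} is not an indirection cell")
    return cell, flags


def shuffle_thunk(regs):
    """Stand-in for the argument shuffling thunk: moves arguments, never touches R4."""
    regs["r0"] = regs["r1"]
    return resolve_worker(regs)     # the thunk ends by jumping to the resolve stub


def il_stub_buggy(regs):
    regs["r4"] = INDIRECTION_CELL | 0b01   # indirection cell (+ flags) loaded into R4
    regs["r4"] = THUNK_ADDR                # next instruction overwrites R4 with the call target
    return shuffle_thunk(regs)             # blx r4


def il_stub_fixed(regs):
    regs["r4"] = INDIRECTION_CELL | 0b01   # R4 keeps the indirection cell
    regs["r12"] = THUNK_ADDR               # hypothetical: call target in another register
    return shuffle_thunk(regs)             # blx r12
```

Calling il_stub_fixed({"r0": 1, "r1": 2, "r4": 0}) hands the resolve worker the cell address and flags it expects, while il_stub_buggy with the same input raises, because R4 still holds the thunk address when the worker decodes it.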
Thanks for opening this in the right repo 😃
@mi-hol I am just building coreclr with a fix so that I can test it with powershell on my RPI3. So I think I will probably send out PR with the fix later today.
Also, R4 is loaded as EA_PTRSIZE in the line above. Instead, it should be loaded as EA_BYREF.
Got @janvorli working