diagnostics: dotnet dump analyze's `gcroot -all` crashes on arm64
Description
I created a hello-world ASP.NET Core application (literally dotnet new web; dotnet build; dotnet bin/Debug/net*/App.dll), then created a dump via dotnet-dump.
I then used dotnet-dump analyze to examine it:
$ dotnet dump analyze coredump.34662 --command dso --command exit
Loading core dump: coredump.34662 ...
OS Thread Id: 0x8766 (0)
SP/REG Object Name
x14 0000ffbf65437160 System.Object
0000FFFFD8B0D480 0000ffbf65437160 System.Object
0000FFFFD8B0D560 0000ffbf65437160 System.Object
0000FFFFD8B0D700 0000ffbf65437160 System.Object
0000FFFFD8B0D708 0000ffbf65437050 System.Threading.Tasks.Task+SetOnInvokeMres
0000FFFFD8B0D770 0000ffbf65437050 System.Threading.Tasks.Task+SetOnInvokeMres
0000FFFFD8B0D7A0 0000ffbf65437050 System.Threading.Tasks.Task+SetOnInvokeMres
0000FFFFD8B0D7A8 0000ffbf65436f78 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Microsoft.Extensions.Hosting.HostingAbstractionsHostExtensions+<RunAsync>d__4, Microsoft.Extensions.Hosting.Abstractions]]
0000FFFFD8B0D7C0 0000ffbf65436f78 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Microsoft.Extensions.Hosting.HostingAbstractionsHostExtensions+<RunAsync>d__4, Microsoft.Extensions.Hosting.Abstractions]]
0000FFFFD8B0D7D8 0000ffbf64806bb8 System.Threading.Tasks.TplEventSource
0000FFFFD8B0D800 0000ffbf65436f78 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Microsoft.Extensions.Hosting.HostingAbstractionsHostExtensions+<RunAsync>d__4, Microsoft.Extensions.Hosting.Abstractions]]
0000FFFFD8B0D878 0000ffbf65436f78 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Microsoft.Extensions.Hosting.HostingAbstractionsHostExtensions+<RunAsync>d__4, Microsoft.Extensions.Hosting.Abstractions]]
0000FFFFD8B0D890 0000ffbf6609b808 Program+<>c
0000FFFFD8B0D898 0000ffbf6609b820 System.Func`1[[System.String, System.Private.CoreLib]]
0000FFFFD8B0D8A0 0000ffbf6609e200 Microsoft.AspNetCore.Builder.RouteHandlerBuilder
0000FFFFD8B0D8A8 0000ffbf6609b820 System.Func`1[[System.String, System.Private.CoreLib]]
0000FFFFD8B0D8B0 0000ffbf66010180 System.String /
0000FFFFD8B0D8B8 0000ffbf660512e0 Microsoft.AspNetCore.Builder.WebApplication
0000FFFFD8B0D8C8 0000ffbf660512e0 Microsoft.AspNetCore.Builder.WebApplication
0000FFFFD8B0D8D0 0000ffbf660101d0 Microsoft.AspNetCore.Builder.WebApplicationBuilder
0000FFFFD8B0D8D8 0000ffbf660512e0 Microsoft.AspNetCore.Builder.WebApplication
0000FFFFD8B0D8E0 0000ffbf660101d0 Microsoft.AspNetCore.Builder.WebApplicationBuilder
0000FFFFD8B0D8E8 0000ffbf6600efa0 System.String[]
0000FFFFD8B0DA00 0000ffbf6600efa0 System.String[]
0000FFFFD8B0DBF8 0000ffbf6600efa0 System.String[]
0000FFFFD8B0DC20 0000ffbf6600efa0 System.String[]
0000FFFFD8B0DD90 0000ffbf6600efa0 System.String[]
0000FFFFD8B0E068 0000ffbf6600efa0 System.String[]
$ dotnet dump analyze coredump.34662 --command 'gcroot -all 0000ffbf65437050' --command exit
This crashes dotnet-dump with a SIGSEGV.
Backtrace from lldb:
Process 37613 stopped
* thread #1, name = 'dotnet-dump', stop reason = signal SIGSEGV: invalid address (fault address: 0xffffffffffffffff)
frame #0: 0x0000ffbec3e4aa9c libmscordaccore.so`DacStackReferenceWalker::GCEnumCallbackSOS(void*, __DPtr<Object>*, unsigned int, _DAC_SLOT_LOCATION) + 64 libmscordaccore.so`DacStackReferenceWalker::GCEnumCallbackSOS:
-> 0xffbec3e4aa9c <+64>: ldr x22, [x21]
0xffbec3e4aaa0 <+68>: mov x21, xzr
0xffbec3e4aaa4 <+72>: tbnz w19, #0x0, 0xffbec3e4aaf4 ; <+152>
0xffbec3e4aaa8 <+76>: b 0xffbec3e4ab20 ; <+196>
(lldb) bt
* thread #1, name = 'dotnet-dump', stop reason = signal SIGSEGV: invalid address (fault address: 0xffffffffffffffff)
* frame #0: 0x0000ffbec3e4aa9c libmscordaccore.so`DacStackReferenceWalker::GCEnumCallbackSOS(void*, __DPtr<Object>*, unsigned int, _DAC_SLOT_LOCATION) + 64
frame #1: 0x0000ffbec3e61150 libmscordaccore.so`GcInfoDecoder::EnumerateLiveSlots(REGDISPLAY*, bool, unsigned int, void (*)(void*, __DPtr<Object>*, unsigned int, _DAC_SLOT_LOCATION), void*) + 4984
frame #2: 0x0000ffbec3e37720 libmscordaccore.so`EECodeManager::EnumGcRefs(REGDISPLAY*, EECodeInfo*, unsigned int, void (*)(void*, __DPtr<Object>*, unsigned int, _DAC_SLOT_LOCATION), void*, unsigned int) + 280
frame #3: 0x0000ffbec3e4b3f4 libmscordaccore.so`DacStackReferenceWalker::Callback(CrawlFrame*, void*) + 352
frame #4: 0x0000ffbec3e31820 libmscordaccore.so`Thread::StackWalkFramesEx(REGDISPLAY*, StackWalkAction (*)(CrawlFrame*, void*), void*, unsigned int, __VPtr<Frame>) + 372
frame #5: 0x0000ffbec3e31b8c libmscordaccore.so`Thread::StackWalkFrames(StackWalkAction (*)(CrawlFrame*, void*), void*, unsigned int, __VPtr<Frame>) + 13$
frame #6: 0x0000ffbec3e4bd3c libmscordaccore.so`unsigned int DacStackReferenceWalker::WalkStack<unsigned int, _SOS_StackRefData>(unsigned int, _SOS_StackRefData*, void (*)(__DPtr<__DPtr<Object> >, ScanContext*, unsigned int), void (*)(void*, __DPtr<Object>*, unsigned int, _DAC_SLOT_LOCATION)) + 212
frame #7: 0x0000ffbec3e4a6bc libmscordaccore.so`DacStackReferenceWalker::GetCount(unsigned int*) + 180
frame #8: 0x0000ffbec8218900 libsos.so`___lldb_unnamed_symbol895 + 152
frame #9: 0x0000ffbec81ce6a4 libsos.so`___lldb_unnamed_symbol391 + 88
frame #10: 0x0000ffbec81cc384 libsos.so`___lldb_unnamed_symbol371 + 236
frame #11: 0x0000ffbec81cbe5c libsos.so`___lldb_unnamed_symbol368 + 408
frame #12: 0x0000ffbec81f6424 libsos.so`GCRoot + 432
frame #13: 0x0000ffff73b12a2c
frame #14: 0x0000ffff73b1286c
frame #15: 0x0000ffff73b12728
frame #16: 0x0000ffff73b123ac
frame #17: 0x0000ffffb21f2d74 libcoreclr.so`CallDescrWorkerInternal + 132
frame #18: 0x0000ffffb204d8f0 libcoreclr.so`CallDescrWorkerWithHandler(CallDescrData*, int) + 132
frame #19: 0x0000ffffb20f2ff4 libcoreclr.so`RuntimeMethodHandle::InvokeMethod(Object*, void**, SignatureNative*, bool) + 1672
frame #20: 0x0000ffff711291dc
frame #21: 0x0000ffff7113e2f8
frame #22: 0x0000ffff73b120ec
frame #23: 0x0000ffff71b007b8
frame #24: 0x0000ffff71b00700
frame #25: 0x0000ffff71aec5ec
frame #26: 0x0000ffff71aec0bc
frame #27: 0x0000ffff71aebab8
frame #28: 0x0000ffff71aeaf54
frame #29: 0x0000ffff71ade610
frame #30: 0x0000ffff71ac4b9c
frame #31: 0x0000ffffb21f2d74 libcoreclr.so`CallDescrWorkerInternal + 132
frame #32: 0x0000ffffb204d8f0 libcoreclr.so`CallDescrWorkerWithHandler(CallDescrData*, int) + 132
frame #33: 0x0000ffffb20f2ff4 libcoreclr.so`RuntimeMethodHandle::InvokeMethod(Object*, void**, SignatureNative*, bool) + 1672
frame #34: 0x0000ffff711291dc
frame #35: 0x0000ffff7113e4b0
frame #36: 0x0000ffff70f08d50
frame #37: 0x0000ffff71ac0f8c
frame #38: 0x0000ffff71ac0bfc
frame #39: 0x0000ffff71ac0b48
frame #40: 0x0000ffff71ac0ae4
frame #41: 0x0000ffff71ac08c0
frame #42: 0x0000ffff71ac06e4
frame #43: 0x0000ffff71ac0630
frame #44: 0x0000ffff71ac05d0
frame #45: 0x0000ffff71abc0d4
frame #46: 0x0000ffff71ac03a4
frame #47: 0x0000ffff71ac01d4
frame #48: 0x0000ffff71ac0120
frame #49: 0x0000ffff71ac00c0
frame #50: 0x0000ffff71abc0d4
frame #51: 0x0000ffff71abfd88
frame #52: 0x0000ffff71abfc14
frame #53: 0x0000ffff71abfb60
frame #54: 0x0000ffff71abfb00
frame #55: 0x0000ffff71abc0d4
frame #56: 0x0000ffff71abf694
frame #57: 0x0000ffff71abf484
frame #58: 0x0000ffff71abf3d0
frame #59: 0x0000ffff71abf370
frame #60: 0x0000ffff71abc0d4
frame #61: 0x0000ffff71abf13c
frame #62: 0x0000ffff71abeea4
frame #63: 0x0000ffff71abedf0
frame #64: 0x0000ffff71abed90
frame #65: 0x0000ffff71abc0d4
frame #66: 0x0000ffff71abeb90
frame #67: 0x0000ffff71abe8dc
frame #68: 0x0000ffff71abe828
frame #69: 0x0000ffff71abe7c8
frame #70: 0x0000ffff71abc0d4
frame #71: 0x0000ffff71abe5d0
frame #72: 0x0000ffff71abe3f4
frame #73: 0x0000ffff71abe340
frame #74: 0x0000ffff71abe2e0
frame #75: 0x0000ffff71abc0d4
frame #76: 0x0000ffff71abe084
frame #77: 0x0000ffff71abdcd4
frame #78: 0x0000ffff71abdc20
frame #79: 0x0000ffff71abdbc0
frame #80: 0x0000ffff71abc0d4
frame #81: 0x0000ffff71abce70
frame #82: 0x0000ffff71abcb94
frame #83: 0x0000ffff71abcae0
frame #84: 0x0000ffff71abca80
frame #85: 0x0000ffff71abc0d4
frame #86: 0x0000ffff71abc888
frame #87: 0x0000ffff71abc0d4
frame #88: 0x0000ffff71abc3a8
frame #89: 0x0000ffff71abc284
frame #90: 0x0000ffff71abc1d0
frame #91: 0x0000ffff71abc170
frame #92: 0x0000ffff71abc0d4
frame #93: 0x0000ffff71abbce4
frame #94: 0x0000ffff71abbacc
frame #95: 0x0000ffff71abba18
frame #96: 0x0000ffff71abb9b8
frame #97: 0x0000ffff71abb8dc
frame #98: 0x0000ffff71abb8dc
frame #99: 0x0000ffff71abb8dc
frame #100: 0x0000ffff71abb8dc
frame #101: 0x0000ffff71abb8dc
frame #102: 0x0000ffff71abb8dc
frame #103: 0x0000ffff71abb8dc
frame #104: 0x0000ffff71abb8dc
frame #105: 0x0000ffff71abb8dc
frame #106: 0x0000ffff71abb8dc
frame #107: 0x0000ffff71abb8dc
frame #108: 0x0000ffff71aba1cc
frame #109: 0x0000ffff71ab9f5c
frame #110: 0x0000ffff71ab9ea8
frame #111: 0x0000ffff71ab9e44
frame #112: 0x0000ffff71ab9b7c
frame #113: 0x0000ffff71ab9a3c
frame #114: 0x0000ffff71ab9988
frame #115: 0x0000ffff71ab9924
frame #116: 0x0000ffff71aa9d40
frame #117: 0x0000ffff71aa9c04
frame #118: 0x0000ffff71aa9b50
frame #119: 0x0000ffff71aa9ab8
frame #120: 0x0000ffff71a90f48
frame #121: 0x0000ffff71a90d78
frame #122: 0x0000ffffb21f2d74 libcoreclr.so`CallDescrWorkerInternal + 132
frame #123: 0x0000ffffb204dfb8 libcoreclr.so`MethodDescCallSite::CallTargetWorker(unsigned long const*, unsigned long*, int) + 816
frame #124: 0x0000ffffb1f3fc8c libcoreclr.so`RunMain(MethodDesc*, short, int*, PtrArray**) + 756
frame #125: 0x0000ffffb1f3ff6c libcoreclr.so`Assembly::ExecuteMainMethod(PtrArray**, int) + 408
frame #126: 0x0000ffffb1f6a110 libcoreclr.so`CorHost2::ExecuteAssembly(unsigned int, char16_t const*, int, char16_t const**, unsigned int*) + 636
frame #127: 0x0000ffffb1f2cbac libcoreclr.so`coreclr_execute_assembly + 240
frame #128: 0x0000ffffb24a4d70 libhostpolicy.so`run_app_for_context(hostpolicy_context_t const&, int, char const**) + 1368
frame #129: 0x0000ffffb24a5084 libhostpolicy.so`run_app(int, char const**) + 72
frame #130: 0x0000ffffb24a5a0c libhostpolicy.so`corehost_main + 200
frame #131: 0x0000ffffb253002c libhostfxr.so`fx_muxer_t::handle_exec_host_command(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, host_startup_info_t const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<known_options, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, known_options_hash, std::equal_to<known_options>, std::allocator<std::pair<known_options const, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > > > const&, int, char const**, int, host_mode_t, bool, char*, int, int*) + 1256
frame #132: 0x0000ffffb252f2a4 libhostfxr.so`fx_muxer_t::execute(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, char const**, host_startup_info_t const&, char*, int, int*) + 704
frame #133: 0x0000ffffb252bab0 libhostfxr.so`hostfxr_main_startupinfo + 172
frame #134: 0x0000aaaab08444a0 dotnet-dump`exe_start(int, char const**) + 1252
frame #135: 0x0000aaaab0844748 dotnet-dump`main + 144
frame #136: 0x0000ffffb25b4384 libc.so.6`__libc_start_main + 220
frame #137: 0x0000aaaab0837c14 dotnet-dump`_start + 52
The full test is here: https://github.com/redhat-developer/dotnet-regular-tests/blob/1b7774d6367751500e440a78909606cab0303a18/debugging-via-dotnet-dump/test.sh
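The address passed to gcroot above was picked by hand from the dso output. A minimal sketch of automating that step follows, assuming the dso rows keep the three-column layout shown above (stack slot or register, 16-hex-digit object address, type name). The helper name first_non_object is hypothetical and is not part of dotnet-dump or SOS; this is not necessarily how the linked test does it.

```shell
#!/bin/sh
# Print the object address of the first dso row whose type is not System.Object.
# Reads dso output on stdin; prints nothing if no such row exists.
# first_non_object is a hypothetical helper name, not part of dotnet-dump or SOS.
first_non_object() {
    awk '
        # Data rows look like: <SP or register> <16-hex-digit object address> <type name>
        # Header lines never have a 16-hex-digit second field, so they are skipped.
        length($2) == 16 && $2 ~ /^[0-9A-Fa-f]+$/ && $3 != "System.Object" {
            print $2
            exit
        }
    '
}

# Example use against the dump from this report (not run here):
#   addr=$(dotnet dump analyze coredump.34662 --command dso --command exit | first_non_object)
#   dotnet dump analyze coredump.34662 --command "gcroot -all $addr" --command exit
```

Matching on the shape of the second column rather than skipping a fixed number of header lines keeps the filter working if the "Loading core dump" banner changes.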
Configuration
- Is this related to a specific tool?
dotnet dump analyze <corefile>, then dso and then gcroot -all object where object is the first non-System.Object from the output of dso. In my case it’s 0000FFFFD8B0D708 0000ffbf65437050 System.Threading.Tasks.Task+SetOnInvokeMres
- What OS and version, and what distro if applicable? RHEL 8, arm64, .NET runtime built via source-build
- What is the architecture (x64, x86, ARM, ARM64)? arm64
- Do you know whether it is specific to that configuration? It doesn’t happen on x64
- Are you running in any particular type of environment? (e.g. Containers, a cloud scenario, app you are trying to target is a different user) Running this in a freshly provisioned VM. .NET was self-built using source-build
- Is it a self-contained published application? No
- What’s the output of dotnet --info?
$ dotnet --info
.NET SDK:
Version: 7.0.103
Commit: 6359034b09
Runtime Environment:
OS Name: rhel
OS Version: 8
OS Platform: Linux
RID: rhel.8-arm64
Base Path: /usr/lib64/dotnet/sdk/7.0.103/
Host:
Version: 7.0.3
Architecture: arm64
Commit: 0a2bda10e8
.NET SDKs installed:
7.0.103 [/usr/lib64/dotnet/sdk]
.NET runtimes installed:
Microsoft.AspNetCore.App 7.0.3 [/usr/lib64/dotnet/shared/Microsoft.AspNetCore.App]
Microsoft.NETCore.App 7.0.3 [/usr/lib64/dotnet/shared/Microsoft.NETCore.App]
Other architectures found:
None
Environment variables:
DOTNET_ROOT [/usr/lib64/dotnet]
global.json file:
Not found
Learn more:
https://aka.ms/dotnet/info
Download .NET:
https://aka.ms/dotnet/download
Regression?
IIRC, this works on x64 without any issues, just fails on arm64. It was working on arm64 in a previous release of .NET as well, though I am not sure whether that was .NET 6 or .NET 7.
Other information
About this issue
- State: closed
- Created a year ago
- Comments: 18 (18 by maintainers)
Commits related to this issue
- Fix gcroot SOS command on arm/arm64 Faulted in DAC because the HelperMethodFrame's REGDISPLAY CurrentContextPointers were not initialized correctly. Fixes issue https://github.com/dotnet/diagnostics... — committed to mikem8361/runtime by mikem8361 a year ago
- Fix gcroot SOS command on arm/arm64 (#90650) Faulted in DAC because the HelperMethodFrame's REGDISPLAY CurrentContextPointers were not initialized correctly. Fixes issue https://github.com/dotnet/... — committed to dotnet/runtime by mikem8361 a year ago
- Fix gcroot SOS command on arm/arm64 Faulted in DAC because the HelperMethodFrame's REGDISPLAY CurrentContextPointers were not initialized correctly. Fixes issue https://github.com/dotnet/diagnostics... — committed to dotnet/runtime by mikem8361 a year ago
- Fix gcroot SOS command on arm/arm64 (#90658) Faulted in DAC because the HelperMethodFrame's REGDISPLAY CurrentContextPointers were not initialized correctly. Fixes issue https://github.com/dotnet/... — committed to dotnet/runtime by github-actions[bot] 10 months ago
Ok, saw your new comment on the PR. x64 should be fixed by another change.
Thanks!
So the big question here is why isn’t the catch block handling this? DacStackReferenceWalker::GetCount is wrapped in an SOS enter/leave:
https://github.com/dotnet/runtime/blob/main/src/coreclr/debug/daccess/daccess.cpp#L7859-L7873
Which is defined here:
https://github.com/dotnet/runtime/blob/main/src/coreclr/debug/daccess/dacimpl.h#L3984-L4004
We should be hitting that EX_END_CATCH(SwallowAllExceptions), and not bringing down the debugger.
Obviously it’s not good that something is causing the underlying crash (I’m working on a fix in .NET 8), but regardless, this should have manifested as a failed function call and not brought down the debugger.