runtime: System.Text.Json.Tests crashing with segfault in CI

Build: https://dev.azure.com/dnceng/public/_build/results?buildId=1287656&view=ms.vss-test-web.build-test-results-tab&runId=38015200&resultId=183344&paneView=dotnet-dnceng.dnceng-build-release-tasks.helix-test-information-tab

Configuration: net6.0-Linux-Release-arm64-CoreCLR_checked-(Alpine.312.Arm64.Open)Ubuntu.1804.ArmArch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:alpine-3.12-helix-arm64v8-20200602002604-25f8a3e

/root/helix/work/workitem /root/helix/work/workitem
  Discovering: System.Text.Json.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  System.Text.Json.Tests (found 2741 of 2798 test cases)
  Starting:    System.Text.Json.Tests (parallel test collections = on, max threads = 4)
./RunTests.sh: line 162:    93 Segmentation fault      (core dumped) "$RUNTIME_PATH/dotnet" exec --runtimeconfig System.Text.Json.Tests.runtimeconfig.json --depsfile System.Text.Json.Tests.deps.json xunit.console.dll System.Text.Json.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing $RSP_FILE
/root/helix/work/workitem
----- end Wed Aug 11 10:28:18 UTC 2021 ----- exit code 139 ----------------------------------------------------------
exit code 139 means SIGSEGV Illegal memory access. Deref invalid pointer, overrunning buffer, stack overflow etc. Core dumped.
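
For readers unfamiliar with the convention, 139 follows the usual shell rule that a process killed by signal N is reported as 128 + N, and signal 11 is SIGSEGV on Linux. A minimal sketch of that decoding (the helper name `describe_exit_code` is made up for illustration and is not part of the Helix tooling):

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe a shell exit status, decoding the 128 + N signal convention."""
    if code > 128:
        signum = code - 128
        try:
            # signal.Signals maps a signal number to its symbolic name.
            return f"killed by {signal.Signals(signum).name} (signal {signum})"
        except ValueError:
            # Number does not correspond to a signal known on this platform.
            return f"killed by unknown signal {signum}"
    return f"exited normally with status {code}"


print(describe_exit_code(139))  # on Linux: killed by SIGSEGV (signal 11)
```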

Console log: https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-57193-merge-32896558dfa04faba1/System.Text.Json.Tests/1/console.0d8519e9.log?sv=2019-07-07&se=2021-08-31T10%3A24%3A36Z&sr=c&sp=rl&sig=sW6tkW0yTybYwRggqu%2FJypbei5TuvxQeYGIra%2BV%2BBa8%3D

Dump: https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-57193-merge-32896558dfa04faba1/System.Text.Json.Tests/1/core.1001.93?sv=2019-07-07&se=2021-08-31T10%3A24%3A36Z&sr=c&sp=rl&sig=sW6tkW0yTybYwRggqu%2FJypbei5TuvxQeYGIra%2BV%2BBa8%3D

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 21 (21 by maintainers)

Most upvoted comments

Btw, in how_to_debug_dump.md, there is a bug in this section:

Within lldb:
```
setclrpath ~/helix_payload/System.Text.Json.Tests/shared/Microsoft.NETCore.App/6.0.0
sethostruntime /usr/bin/dotnet
setsymbolserver -directory ~/helix_payload/System.Text.Json.Tests/shared/Microsoft.NETCore.App/6.0.0
```

it should be:

Within lldb:
```
setclrpath /home/<your username>/helix_payload/System.Text.Json.Tests/shared/Microsoft.NETCore.App/6.0.0
sethostruntime /usr/bin/dotnet
setsymbolserver -directory /home/<your username>/helix_payload/System.Text.Json.Tests/shared/Microsoft.NETCore.App/6.0.0
```

because lldb doesn’t expand the shell’s ~ shortcut to the home directory, and the resulting (cryptic) error looks similar to what Eirik pointed out above:

...
(lldb) setsymbolserver -directory ~/helix_payload/System.Text.Json.Tests/shared/Microsoft.NETCore.App/7.0.0
Added symbol directory path: ~/helix_payload/System.Text.Json.Tests/shared/Microsoft.NETCore.App/7.0.0
(lldb) clrstack
Failed to load data access module, 0x80004002
Can not load or initialize libmscordaccore.so. The target runtime may not be initialized.

For more information see https://go.microsoft.com/fwlink/?linkid=2135652
ClrStack  failed

Another issue is that we should replace 6.0.0 with 7.0.0, since most of these commands do not validate that the 6.0.0 directory actually exists. (I had to repeat the steps a few times to spot these gotchas 😁)
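
One way to avoid both gotchas is to expand and check the path outside lldb, then paste the generated commands in. The following is only a sketch: the helper name `lldb_setup_commands` is hypothetical, and it assumes the three SOS commands quoted above are the ones you need, with the same directory used for `setclrpath` and `setsymbolserver -directory`.

```python
import os


def lldb_setup_commands(runtime_dir, host_runtime="/usr/bin/dotnet"):
    """Build the SOS setup commands using an absolute, verified runtime directory."""
    # Expand "~" here, since lldb hands the path to SOS unexpanded.
    resolved = os.path.abspath(os.path.expanduser(runtime_dir))
    # Fail early: the SOS commands accept a nonexistent directory silently.
    if not os.path.isdir(resolved):
        raise FileNotFoundError(f"runtime directory not found: {resolved}")
    return [
        f"setclrpath {resolved}",
        f"sethostruntime {host_runtime}",
        f"setsymbolserver -directory {resolved}",
    ]


if __name__ == "__main__":
    # Example path taken from the lldb session above; adjust the version as needed.
    example = "~/helix_payload/System.Text.Json.Tests/shared/Microsoft.NETCore.App/7.0.0"
    try:
        print("\n".join(lldb_setup_commands(example)))
    except FileNotFoundError as err:
        print(err)
```

If the directory is missing (for example, 6.0.0 when only 7.0.0 was downloaded), this raises an error up front instead of surfacing later as the `libmscordaccore.so` load failure.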

@am11 thanks for pointing out the errors. Perhaps you could tweak https://github.com/dotnet/runtime/blob/main/eng/testing/debug-dump-template.md accordingly (it is processed to generate this file); you can just click the pencil button to edit it in GitHub directly, and PR validation will not be necessary.