runtime: CLRJit Access Violation when attempting to run Windows ARM64 app
Description
Also tracked: https://github.com/dotnet/performance/issues/2421
In performance runs for Windows ARM64, we are failing to run benchmarks due to this error:
[2022/05/09 06:29:58][INFO] Process terminated. Infinite recursion during resource lookup within System.Private.CoreLib. This may be a bug in System.Private.CoreLib, or potentially in certain extensibility points such as assembly resolve events or CultureInfo names. Resource name: Arg_AccessViolationException
[2022/05/09 06:29:58][INFO] at System.Environment.FailFast(System.String)
[2022/05/09 06:29:58][INFO] at System.SR.InternalGetResourceString(System.String)
[2022/05/09 06:29:58][INFO] at System.SR.GetResourceString(System.String)
[2022/05/09 06:29:58][INFO] at System.AccessViolationException..ctor()
[2022/05/09 06:29:58][INFO] at System.SpanHelpers.IndexOf(Char ByRef, Char, Int32)
[2022/05/09 06:29:58][INFO] at System.String.Ctor(Char*)
[2022/05/09 06:29:58][INFO] at System.Globalization.CultureData.GetLocaleInfoEx(System.String, UInt32)
[2022/05/09 06:29:58][INFO] at System.Globalization.CultureInfo.GetUserDefaultLocaleName()
[2022/05/09 06:29:58][INFO] at System.Globalization.CultureInfo..cctor()
[2022/05/09 06:29:58][INFO] at System.Globalization.CultureInfo.get_CachedCulturesByName()
[2022/05/09 06:29:58][INFO] at System.Globalization.CultureInfo.GetCultureInfo(System.String)
[2022/05/09 06:29:58][INFO] at System.Resources.ManifestBasedResourceGroveler.GetNeutralResourcesLanguage(System.Reflection.Assembly, System.Resources.UltimateResourceFallbackLocation ByRef)
[2022/05/09 06:29:58][INFO] at System.Resources.ResourceManager.CommonAssemblyInit()
[2022/05/09 06:29:58][INFO] at System.SR.get_ResourceManager()
[2022/05/09 06:29:58][INFO] at System.SR.InternalGetResourceString(System.String)
[2022/05/09 06:29:58][INFO] at System.SR.GetResourceString(System.String)
[2022/05/09 06:29:58][INFO] at System.AccessViolationException..ctor()
[2022/05/09 06:29:58][INFO] at System.SpanHelpers.IndexOf(Char ByRef, Char, Int32)
[2022/05/09 06:29:58][INFO] at System.String.Ctor(Char*)
[2022/05/09 06:29:58][INFO] at System.AppContext.Setup(Char**, Char**, Int32)
Looking deeper into this error and testing our built CoreRun.exe and the hello world console app locally, we were able to get the following error and stack trace:
(32cc.44a8): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
*** WARNING: Unable to verify checksum for C:\Users\parkerbibus\Desktop\TestPayload\Core_Root\clrjit.dll
clrjit!getLikelyClasses+0x18134:
00007ffb`153bdaf4 b86e78ea ldr w10,[x7,x14 lsl #2]
# Child-SP RetAddr Call Site
00 (Inline Function) --------`-------- clrjit!LinearScan::RegisterSelection::calculateCoversSets+0xb8 [D:\a\_work\1\s\src\coreclr\jit\lsra.cpp @ 11546]
01 (Inline Function) --------`-------- clrjit!LinearScan::RegisterSelection::try_COVERS+0xb8 [D:\a\_work\1\s\src\coreclr\jit\lsra.cpp @ 11096]
02 (Inline Function) --------`-------- clrjit!LinearScan::RegisterSelection::select+0x890 [D:\a\_work\1\s\src\coreclr\jit\lsra_score.h @ 27]
03 000000ba`d61ca990 00007ffb`153c01a0 clrjit!LinearScan::allocateReg+0x8bc [D:\a\_work\1\s\src\coreclr\jit\lsra.cpp @ 2754]
04 000000ba`d61caa50 00007ffb`153bc1b8 clrjit!LinearScan::allocateRegisters+0xd08 [D:\a\_work\1\s\src\coreclr\jit\lsra.cpp @ 5381]
05 000000ba`d61cab40 00007ffb`1532eccc clrjit!LinearScan::doLinearScan+0x1c8 [D:\a\_work\1\s\src\coreclr\jit\lsra.cpp @ 1246]
06 (Inline Function) --------`-------- clrjit!Compiler::compCompile::__l2::<lambda_27ba776e5f59c8b10b5aa52fbc87aa8b>::operator()+0x24 [D:\a\_work\1\s\src\coreclr\jit\compiler.cpp @ 5059]
07 000000ba`d61cab80 00007ffb`153ebdbc clrjit!ActionPhase<<lambda_27ba776e5f59c8b10b5aa52fbc87aa8b> >::DoPhase+0x2c [D:\a\_work\1\s\src\coreclr\jit\phase.h @ 65]
08 000000ba`d61cab90 00007ffb`1532c588 clrjit!Phase::Run+0x4c [D:\a\_work\1\s\src\coreclr\jit\phase.cpp @ 62]
09 (Inline Function) --------`-------- clrjit!DoPhase+0x34 [D:\a\_work\1\s\src\coreclr\jit\phase.h @ 78]
0a 000000ba`d61cabb0 00007ffb`1532d540 clrjit!Compiler::compCompile+0x830 [D:\a\_work\1\s\src\coreclr\jit\compiler.cpp @ 5063]
0b 000000ba`d61cae00 00007ffb`1532ce48 clrjit!Compiler::compCompileHelper+0x6d8 [D:\a\_work\1\s\src\coreclr\jit\compiler.cpp @ 6729]
0c 000000ba`d61caec0 00007ffb`1532dbdc clrjit!Compiler::compCompile+0x4c8 [D:\a\_work\1\s\src\coreclr\jit\compiler.cpp @ 5861]
0d 000000ba`d61caf90 00007ffb`15335c58 clrjit!jitNativeCode+0x17c [D:\a\_work\1\s\src\coreclr\jit\compiler.cpp @ 7362]
0e 000000ba`d61cb180 00007ffb`0f0609b8 clrjit!CILJit::compileMethod+0x78 [D:\a\_work\1\s\src\coreclr\jit\ee_il_dll.cpp @ 279]
*** WARNING: Unable to verify checksum for C:\Users\parkerbibus\Desktop\TestPayload\Core_Root\coreclr.dll
0f (Inline Function) --------`-------- coreclr!invokeCompileMethodHelper+0x1f4 [D:\a\_work\1\s\src\coreclr\vm\jitinterface.cpp @ 12367]
10 (Inline Function) --------`-------- coreclr!invokeCompileMethod+0x250 [D:\a\_work\1\s\src\coreclr\vm\jitinterface.cpp @ 12430]
11 000000ba`d61cb1e0 00007ffb`0f0916dc coreclr!UnsafeJitFunction+0x10c8 [D:\a\_work\1\s\src\coreclr\vm\jitinterface.cpp @ 12904]
12 000000ba`d61cb600 00007ffb`0f0908dc coreclr!MethodDesc::JitCompileCodeLocked+0x1ec [D:\a\_work\1\s\src\coreclr\vm\prestub.cpp @ 946]
13 (Inline Function) --------`-------- coreclr!MethodDesc::JitCompileCodeLockedEventWrapper+0x850 [D:\a\_work\1\s\src\coreclr\vm\prestub.cpp @ 817]
14 (Inline Function) --------`-------- coreclr!MethodDesc::JitCompileCode+0xcfc [D:\a\_work\1\s\src\coreclr\vm\prestub.cpp @ 757]
15 (Inline Function) --------`-------- coreclr!MethodDesc::PrepareILBasedCode+0xf68 [D:\a\_work\1\s\src\coreclr\vm\prestub.cpp @ 426]
16 000000ba`d61cb7a0 00007ffb`0f092c18 coreclr!MethodDesc::PrepareCode+0xf94 [D:\a\_work\1\s\src\coreclr\vm\prestub.cpp @ 323]
17 (Inline Function) --------`-------- coreclr!CodeVersionManager::PublishVersionableCodeIfNecessary+0x29c [D:\a\_work\1\s\src\coreclr\vm\codeversion.cpp @ 1698]
18 000000ba`d61cba40 00007ffb`0f092714 coreclr!MethodDesc::DoPrestub+0x490 [D:\a\_work\1\s\src\coreclr\vm\prestub.cpp @ 2106]
19 000000ba`d61ce6a0 00007ffb`0efd11f4 coreclr!PreStubWorker+0x1c4 [D:\a\_work\1\s\src\coreclr\vm\prestub.cpp @ 1932]
1a 000000ba`d61ce790 00007ffb`0cd49200 coreclr!ThePreStub+0x50 [D:\a\_work\1\s\artifacts\obj\coreclr\windows.arm64.Release\vm\wks\AsmHelpers.asm @ 5028]
*** WARNING: Unable to verify checksum for C:\Users\parkerbibus\Desktop\TestPayload\Core_Root\System.Private.CoreLib.dll
1b 000000ba`d61ce8c0 00007ffb`0cd5fe84 System_Private_CoreLib+0x209200
1c 000000ba`d61ce8e0 00007ffb`0efd1f34 System_Private_CoreLib+0x21fe84
1d 000000ba`d61ce940 00007ffb`0f0e0a60 coreclr!CallDescrWorkerInternal+0x84 [D:\a\_work\1\s\artifacts\obj\coreclr\windows.arm64.Release\vm\wks\CallDescrWorkerARM64.asm @ 4539]
1e (Inline Function) --------`-------- coreclr!CallDescrWorkerWithHandler+0x34 [D:\a\_work\1\s\src\coreclr\vm\callhelpers.cpp @ 67]
1f 000000ba`d61ce960 00007ffb`0f00f9c0 coreclr!MethodDescCallSite::CallTargetWorker+0x2e8 [D:\a\_work\1\s\src\coreclr\vm\callhelpers.cpp @ 568]
20 (Inline Function) --------`-------- coreclr!MethodDescCallSite::Call+0x14 [D:\a\_work\1\s\src\coreclr\vm\callhelpers.h @ 458]
21 000000ba`d61cebb0 00007ffb`0f36a394 coreclr!CorHost2::CreateAppDomainWithManager+0x1f0 [D:\a\_work\1\s\src\coreclr\vm\corhost.cpp @ 630]
22 000000ba`d61cf000 00007ff6`107d5588 coreclr!coreclr_initialize+0x4c4 [D:\a\_work\1\s\src\coreclr\dlls\mscoree\exports.cpp @ 254]
23 000000ba`d61cf0e0 00007ff6`107d6778 corerun+0x5588
24 000000ba`d61cf670 00007ff6`107e65e0 corerun+0x6778
25 000000ba`d61cf780 00007ff6`107e6684 corerun!GetCurrentClrDetails+0xfe40
26 000000ba`d61cf7c0 00007ffb`744584a8 corerun!GetCurrentClrDetails+0xfee4
27 000000ba`d61cf7d0 00007ffb`74643068 KERNEL32!BaseThreadInitThunk+0x38
28 000000ba`d61cf810 00000000`00000000 ntdll!RtlUserThreadStart+0x48
A run with this error can be found here: https://dev.azure.com/dnceng/internal/_build/results?buildId=1765074&view=results, in either of the Windows runs.
Reproduction Steps
Using the latest dotnet SDK on a Windows ARM64 machine:
- Download a build of the latest corerun (the one I tested with is available here: https://pvscmdupload.blob.core.windows.net/drewtest/Core_Root_05_11.zip)
- Create a new console app
- Try to run the app with CoreRun.exe (%PathToCoreRun%/corerun.exe %PathToConsoleApp%/app.exe)
Expected behavior
App should print hello world.
Actual behavior
Failure occurs, printing:
[2022/05/09 06:29:58][INFO] Process terminated. Infinite recursion during resource lookup within System.Private.CoreLib. This may be a bug in System.Private.CoreLib, or potentially in certain extensibility points such as assembly resolve events or CultureInfo names. Resource name: Arg_AccessViolationException
Regression?
This last worked with runtime commit 3e5517be and first failed with f8fa9f6d. As far as I could tell, that range is made up of these PRs: https://github.com/dotnet/runtime/pull/67771, https://github.com/dotnet/runtime/pull/69013, and https://github.com/dotnet/runtime/pull/68748.
Known Workarounds
No response
Configuration
This occurs on Windows ARM64 machines when running with the latest dotnet version. This does not seem to be an issue on non-Windows ARM64 configurations.
Other information
No response
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 31 (31 by maintainers)
TLDR
Because of a C++ compiler optimization, the code in genLog2() becomes functionally incorrect, which leads to the AV. The optimizer transforms the original code into a sequence that is no longer equivalent.
Details
Consider the genRegNumFromMask() method, which invokes genLog2() with value = 0x100000000:
https://github.com/dotnet/runtime/blob/88fb9667ca71505ccdbcbdfe0dec1208873725b4/src/coreclr/jit/compiler.hpp#L521-L534
Pasting here for quick reference:
genLog2() and surrounding code
Here is the assembly code of the relevant portion:
Contents of hashTable https://github.com/dotnet/runtime/blob/88fb9667ca71505ccdbcbdfe0dec1208873725b4/src/coreclr/inc/bitposition.h#L34-L40
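For readers without the linked sources at hand, here is a minimal, self-contained sketch of the technique involved: mapping a single-bit mask to its bit index through a small modulo-prime lookup table, then extending that to 64-bit masks by handling the low and high halves separately. It illustrates the idea under assumptions and is not the JIT's code. The names bitPositionSketch and genLog2Sketch are hypothetical, the prime 37 and the split into halves are assumed rather than copied from the linked files, and the table is generated at first use instead of being hard-coded.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumption: 37 is used as the table size. For a 32-bit value with exactly
// one bit set, value % 37 is distinct for every bit position, so a 37-entry
// table can map the remainder back to the bit index.
constexpr unsigned kPrime = 37;

// Builds the remainder -> bit-index table (hypothetical helper).
inline std::array<int8_t, kPrime> buildTableSketch()
{
    std::array<int8_t, kPrime> table;
    table.fill(-1); // -1 marks remainders that no single-bit value produces
    for (unsigned bit = 0; bit < 32; ++bit)
    {
        uint32_t value = uint32_t{1} << bit;
        table[value % kPrime] = static_cast<int8_t>(bit);
    }
    return table;
}

// Returns the index of the single set bit in a 32-bit value.
inline unsigned bitPositionSketch(uint32_t value)
{
    static const std::array<int8_t, kPrime> table = buildTableSketch();
    assert(value != 0 && (value & (value - 1)) == 0); // exactly one bit set
    int8_t index = table[value % kPrime];
    assert(index >= 0); // a -1 entry would mean the input was not a single bit
    return static_cast<unsigned>(index);
}

// Returns the index of the single set bit in a 64-bit value by looking at
// the low half first and falling back to the high half (offset by 32).
inline unsigned genLog2Sketch(uint64_t value)
{
    uint32_t lo32 = static_cast<uint32_t>(value);
    uint32_t hi32 = static_cast<uint32_t>(value >> 32);
    if (lo32 != 0)
    {
        return bitPositionSketch(lo32);
    }
    return bitPositionSketch(hi32) + 32;
}
```

In this sketch, genLog2Sketch(0x100000000ULL) evaluates to 32, because only bit 32 is set: the low half is zero, the high half is 1, and the lookup for 1 gives index 0 before the offset of 32 is added. Any result outside the range 0 to 63 for a single-bit input would indicate that the lookup went wrong.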
Next, we try to access nextIntervalRef using the value returned from genLog2(), which is 0xffffff, and that results in the AV. I will follow up with the C++ team.
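The failure mode here, an out-of-range value used directly as an array index, can be illustrated in isolation. The following is a minimal sketch, assuming a hypothetical fixed-size table with one slot per register; the type RegTableSketch, the member nextRefSketch, and the count of 64 are made up for illustration and are not taken from the JIT sources. The checked accessor rejects an index such as 0xffffff instead of reading far outside the array.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64 slots, one per bit of a 64-bit register mask.
constexpr std::size_t kRegCountSketch = 64;

struct RegTableSketch
{
    // Hypothetical per-register data, zero-initialized.
    std::array<uint32_t, kRegCountSketch> nextRefSketch{};

    // Returns a pointer to the slot for regIndex, or nullptr when the index
    // is out of range. An unchecked nextRefSketch[regIndex] with a bogus
    // index would read unrelated memory and may fault.
    const uint32_t* tryGet(std::size_t regIndex) const
    {
        if (regIndex >= nextRefSketch.size())
        {
            return nullptr;
        }
        return &nextRefSketch[regIndex];
    }
};
```

With this shape, a lookup for index 32 succeeds, while a lookup for 0xffffff returns nullptr and the caller can report the bad index rather than crash.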