runtime: Reloc failures with NativeAOT on Apple Silicon
I am trying to enable NativeAOT on OSX arm64. With this patch https://github.com/dotnet/runtime/compare/main...am11:feature/nativeaot/osx-arm64 (tested with both `@GOTPAGE` and `@PAGE` assembler directives), it builds the nupkg. Consuming that package results in the following errors during the `ilc` step:
# with `<add key="TestSource" value="/Users/am11/projects/runtime/artifacts/packages/Release/Shipping" />`
# in NuGet.config
$ dotnet nuget locals all --clear && rm -rf obj bin && dotnet publish --use-current-runtime -v:diag ...
... snip ...
21:06:05.007 1:7>Target "IlcCompile: (TargetId:181)" in file "/Users/am11/.nuget/packages/microsoft.dotnet.ilcompiler/7.0.0-dev/build/Microsoft.NETCore.Native.targets" from project "/Users/am11/projects/naot1/naot1.csproj" (target "LinkNative" depends on it):
Building target "IlcCompile" completely.
Output file "obj/release/net7.0/osx-arm64/native/naot1.o" does not exist.
Task "Message" skipped, due to false condition; ($(_BuildingInCompatibleMode) != 'true') was evaluated as (true != 'true').
Task "Message" (TaskId:126)
Task Parameter:Text=Generating compatible native code. To optimize for size or speed, visit https://aka.ms/OptimizeCoreRT (TaskId:126)
Task Parameter:Importance=high (TaskId:126)
Generating compatible native code. To optimize for size or speed, visit https://aka.ms/OptimizeCoreRT (TaskId:126)
Done executing task "Message". (TaskId:126)
Task "Exec" (TaskId:127)
Task Parameter:Command="/Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/tools/ilc" @"obj/release/net7.0/osx-arm64/native/naot1.ilc.rsp" (TaskId:127)
"/Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/tools/ilc" @"obj/release/net7.0/osx-arm64/native/naot1.ilc.rsp" (TaskId:127)
<unknown>:0: error: ADR/ADRP relocations must be GOT relative (TaskId:127)
<unknown>:0: error: unknown AArch64 fixup kind! (TaskId:127)
<unknown>:0: error: unknown AArch64 fixup kind! (TaskId:127)
<unknown>:0: error: fixup value out of range (TaskId:127)
<unknown>:0: error: ADR/ADRP relocations must be GOT relative (TaskId:127)
<unknown>:0: error: unknown AArch64 fixup kind! (TaskId:127)
<unknown>:0: error: unknown AArch64 fixup kind! (TaskId:127)
<unknown>:0: error: fixup value out of range (TaskId:127)
... repeats 1000s of times ...
These errors are emitted somewhere after the objwriter has succeeded (https://github.com/dotnet/runtime/blob/071e772d9d3bd8b50a5380bce6214277a1e61c98/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/DependencyAnalysis/ObjectWriter.cs#L1183) and before the `clang` command is executed. While the `ilc` task does not fail, MSBuild fails on the `clang` step:
Set Property: _IgnoreLinkerWarnings=false
Set Property: _IgnoreLinkerWarnings=true
Task "Exec" (TaskId:129)
Task Parameter:IgnoreStandardErrorWarningFormat=True (TaskId:129)
Task Parameter:Command=clang "obj/release/net7.0/osx-arm64/native/naot1.o" -o "bin/release/net7.0/osx-arm64/native/naot1" /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/sdk/libbootstrapper.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/sdk/libRuntime.WorkstationGC.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Globalization.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.IO.Compression.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Net.Security.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Security.Cryptography.Native.Apple.a -g -Wl,-rpath,'@executable_path' -lstdc++ -ldl -lm -lz -licucore -framework CoreFoundation -framework Foundation -framework Security -framework GSS (TaskId:129)
clang "obj/release/net7.0/osx-arm64/native/naot1.o" -o "bin/release/net7.0/osx-arm64/native/naot1" /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/sdk/libbootstrapper.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/sdk/libRuntime.WorkstationGC.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Globalization.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.IO.Compression.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Net.Security.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Security.Cryptography.Native.Apple.a -g -Wl,-rpath,'@executable_path' -lstdc++ -ldl -lm -lz -licucore -framework CoreFoundation -framework Foundation -framework Security -framework GSS (TaskId:129)
ld: malformed __LD,__compact_unwind section, bad length file 'obj/release/net7.0/osx-arm64/native/naot1.o' (TaskId:129)
clang: error: linker command failed with exit code 1 (use -v to see invocation) (TaskId:129)
21:06:12.873 1:7>/Users/am11/.nuget/packages/microsoft.dotnet.ilcompiler/7.0.0-dev/build/Microsoft.NETCore.Native.targets(337,5): error MSB3073: The command "clang "obj/release/net7.0/osx-arm64/native/naot1.o" -o "bin/release/net7.0/osx-arm64/native/naot1" /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/sdk/libbootstrapper.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/sdk/libRuntime.WorkstationGC.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Globalization.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.IO.Compression.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Net.Security.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Security.Cryptography.Native.Apple.a -g -Wl,-rpath,'@executable_path' -lstdc++ -ldl -lm -lz -licucore -framework CoreFoundation -framework Foundation -framework Security -framework GSS" exited with code 1. [/Users/am11/projects/naot1/naot1.csproj]
Done executing task "Exec" -- FAILED. (TaskId:129)
21:06:12.873 1:7>Done building target "LinkNative" in project "naot1.csproj" -- FAILED.: (TargetId:182)
With `objdump`, that `__LD,__compact_unwind` section looks like:
Disassembly of section __LD,__compact_unwind:
00000000003b2858 <ltmp8>:
3b2858: 40 4b 00 00 udf #19264
3b285c: 00 00 00 00 udf #0
3b2860: 74 00 00 00 udf #116
3b2864: 00 00 00 03 <unknown>
...
3b2874: c0 4b 00 00 udf #19392
3b2878: 00 00 00 00 udf #0
3b287c: 74 00 00 00 udf #116
3b2880: 00 00 00 03 <unknown>
...
3b2890: 40 4c 00 00 udf #19520
3b2894: 00 00 00 00 udf #0
3b2898: 74 00 00 00 udf #116
3b289c: 00 00 00 03 <unknown>
...
3b28ac: c0 4c 00 00 udf #19648
3b28b0: 00 00 00 00 udf #0
3b28b4: 74 00 00 00 udf #116
3b28b8: 00 00 00 03 <unknown>
... repeats ...
About this issue
- State: closed
- Created 2 years ago
- Reactions: 4
- Comments: 103 (103 by maintainers)
Pushed a commit to objwriter (https://github.com/dotnet/llvm-project/pull/185/commits/7280b550bb8ac3cce72a1ee288dc744b1ab9b1b6) which fixes the type 15 error. After that, `dotnet publish` produced the binary successfully, but it does not print `Hello World!` yet.
You may be seeing #75298
Yep, that matches what I get. Not every time though.
this assertion is failing:
Disabling `FEATURE_USE_SOFTWARE_WRITE_WATCH_FOR_GC_HEAP` fixed it and a few others.
I got way further (e.g. printing "Hello World" works) but exceptions failed to unwind. That's why I started poking into it.
Right, it says to subtract an address from where the reloc is pointing to. It should be exactly what we need here.
Here is an example of how LLVM creates relative relocs using the subtractor: https://github.com/llvm/llvm-project/blob/6c9f6812523a706c11a12e6cb4119b0cf67bbb21/lld/MachO/EhFrame.cpp#L108-L130
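To make the subtractor idea concrete, here is a minimal sketch of the arithmetic, assuming the usual Mach-O pairing where a SUBTRACTOR reloc names the symbol to subtract and the following UNSIGNED reloc names the symbol to add. The function names and the addresses are hypothetical and only illustrate the computation; this is not the objwriter or lld code.

```python
def resolve_subtractor_pair(minuend, subtrahend, addend=0, width_bits=64):
    """Value stored for a SUBTRACTOR + UNSIGNED pair:
    minuend - subtrahend + addend, truncated to the field width."""
    mask = (1 << width_bits) - 1
    return (minuend - subtrahend + addend) & mask


def to_signed(value, width_bits):
    """Reinterpret an unsigned field value as a two's-complement integer."""
    sign_bit = 1 << (width_bits - 1)
    return (value ^ sign_bit) - sign_bit


# Hypothetical addresses, chosen only for illustration.
field_addr = 0x1000   # where the relocated 32-bit field lives
target_addr = 0x0F00  # the symbol the field should refer to

# Subtracting the field's own address yields a self-relative offset.
stored = resolve_subtractor_pair(target_addr, field_addr, width_bits=32)
offset = to_signed(stored, 32)

print(hex(stored))               # 0xffffff00
print(offset)                    # -256
print(hex(field_addr + offset))  # 0xf00, i.e. the target again
```

A reader of the field recovers the target by adding the stored offset to the field's own address, which is why subtracting "where the reloc is pointing to" gives a position-independent reference.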
Ok, looks like we have a case of a wrong relocation. When we're in `GCHandle__get_IsAllocated`, `this` in `x0` is `0x00000001004c3618`, but from the memory map you posted above, that address is part of `naot1.__DATA_CONST.__got`. So we're in the global offset table, but the code doesn't expect having to dereference the pointer (the GOT is a table of indirections).
So the object writer generated a GOT relocation for something that should just store the address of the destination directly. GOT requires an extra dereference to access what the reloc points to.
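The difference between the two relocation styles can be sketched with a toy memory model. Only the GOT slot address comes from the thread; the object address, the stored value, and the helper names are assumptions made up for this illustration.

```python
# Toy memory: address -> 64-bit value stored there.
GOT_SLOT = 0x00000001004C3618     # the value observed in x0
OBJECT_ADDR = 0x0000000100500000  # hypothetical address of the real data
OBJECT_VALUE = 0xABCD             # hypothetical contents of the real data

memory = {
    GOT_SLOT: OBJECT_ADDR,      # a GOT slot holds a pointer to the object
    OBJECT_ADDR: OBJECT_VALUE,  # the object itself
}


def load(addr):
    """Read one 64-bit value from the toy memory."""
    return memory[addr]


def read_expecting_direct_address(pointer):
    """What the code does: treat the relocated value as the object address."""
    return load(pointer)


def read_through_got(slot):
    """What a GOT reference needs: one extra dereference via the slot."""
    return load(load(slot))


# Code that expected a direct address but was handed a GOT slot reads the
# slot contents (a pointer) instead of the object.
print(hex(read_expecting_direct_address(GOT_SLOT)))  # 0x100500000
print(hex(read_through_got(GOT_SLOT)))               # 0xabcd
```

The first read returns a pointer where the caller expected object data, which would be consistent with `this` pointing into `__DATA_CONST.__got`.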
This might be responsible for the problem: https://github.com/dotnet/llvm-project/pull/185/files#diff-3dd92a728cef8bf36a3e8104cbfcf2b9b901abff46340d48f5cf0a02d3274a2aR455-R456
That looks related to the TLS access. We're here:
https://github.com/dotnet/runtime/blob/9b2e2a830a4e2e67c920aa200329533baba5c363/src/coreclr/nativeaot/Runtime/arm64/AllocFast.S#L193-L213
My suspicion is that `INLINE_GETTHREAD` loaded a bogus address into `x3`. It's supposed to load the `tls_CurrentThread` thread-local static.
https://github.com/dotnet/runtime/blob/9b2e2a830a4e2e67c920aa200329533baba5c363/src/coreclr/nativeaot/Runtime/unix/unixasmmacrosarm64.inc#L215-L217
I would put a breakpoint here:
https://github.com/dotnet/runtime/blob/9b2e2a830a4e2e67c920aa200329533baba5c363/src/coreclr/nativeaot/Runtime/threadstore.inl#L6-L10
and see what value the variable has (and how the compiler got to it in assembly). Then compare with what `INLINE_GETTHREAD` came up with. (Make sure you're looking at the same thread; we already have the finalizer thread running at this point in startup.)
Digging in Apple's source code, the error seems to be:
https://github.com/apple-oss-distributions/ld64/blob/dbf8f7feb5579761f1623b004bd468bdea7c6225/src/ld/parsers/macho_relocatable_file.cpp#L5631-L5632
The size of the `__compact_unwind` section is not divisible by the size of the unwind entry. That's odd because we don't generate the Apple weird thing, we generate DWARF CFI.
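As a rough cross-check against the objdump listing earlier in the thread, the sketch below decodes the first entry and measures the spacing between entries. It assumes the 64-bit compact unwind entry layout (8-byte function start, 4-byte length, 4-byte encoding, 8-byte personality, 8-byte LSDA, 32 bytes in total) and the arm64 mode constants from Apple's compact unwind encoding header. It also assumes each repeated group in the listing is one entry, with the `...` lines standing for elided zero bytes.

```python
import struct

# Assumed 64-bit entry layout: start(8) + length(4) + encoding(4)
# + personality(8) + LSDA(8).
ENTRY_SIZE_64 = 32
UNWIND_ARM64_MODE_MASK = 0x0F000000
UNWIND_ARM64_MODE_DWARF = 0x03000000

# Addresses at which each repeated group starts in the objdump output.
entry_starts = [0x3B2858, 0x3B2874, 0x3B2890, 0x3B28AC]


def strides(addresses):
    """Distance in bytes between consecutive entry start addresses."""
    return [b - a for a, b in zip(addresses, addresses[1:])]


def decode_head(raw):
    """Decode function start, length and encoding from the first 16 bytes."""
    return struct.unpack("<QII", raw[:16])


# First 16 bytes of the section, copied from the listing.
first = bytes.fromhex("404b0000" "00000000" "74000000" "00000003")
start, length, encoding = decode_head(first)

print(strides(entry_starts))                         # [28, 28, 28]
print(hex(start), length)                            # 0x4b40 116
print(hex(encoding & UNWIND_ARM64_MODE_MASK))        # 0x3000000
print(all(s == ENTRY_SIZE_64 for s in strides(entry_starts)))  # False
```

Under these assumptions the entries are spaced 28 bytes apart instead of 32, and the encoding carries the "use DWARF" mode. That would fit a section length that is not a multiple of the entry size, though it does not by itself say which field is emitted too short.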
However, looking at LLVM source code, I think this is kicking in:
https://github.com/dotnet/llvm-project/blob/f1120a92d05f1c57e75af7d16504012570ef3409/llvm/lib/MC/MCObjectFileInfo.cpp#L30-L32
And LLVM does generate something on our behalf. Probably broken from the sound of it.
I would dig around that - can we still make do without a `__compact_unwind` section? If it's not present in the executable, maybe ld would still do the right thing and convert it from CFI to the compact unwind scheme for us.
Maybe the right thing would be to start generating compact unwinding because Apple tends to unceremoniously cut off things they don't like anymore after a couple years of supporting both the thing they stopped liking and the new shiny thing. Unwinding codes are currently generated in RyuJIT.
Ah, so those messages are still generated by the object writer in ILC.
For example here: https://github.com/dotnet/llvm-project/blob/f1120a92d05f1c57e75af7d16504012570ef3409/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MachObjectWriter.cpp#L102-L103.
Looks like we need to decide what kind of relocation to generate when we're generating it.
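For reference, the address arithmetic behind an `adrp` + `add` (or `adrp` + `ldr`) pair can be sketched as below. With `@PAGE`/`@PAGEOFF` the target is the symbol itself; with `@GOTPAGE`/`@GOTPAGEOFF` the target is the symbol's GOT slot, which then has to be loaded from. The helper names and the example `pc` are hypothetical, and the range check only models ADRP's signed 21-bit page immediate, not LLVM's actual fixup validation.

```python
PAGE_SIZE = 0x1000  # ADRP works in 4 KiB pages regardless of the OS page size


def page(addr):
    """Clear the low 12 bits, giving the 4 KiB page base of addr."""
    return addr & ~(PAGE_SIZE - 1)


def split_page_reference(pc, target):
    """Return (page delta for ADRP, low 12 bits for the ADD/LDR)."""
    delta_pages = (page(target) - page(pc)) >> 12
    if not -(1 << 20) <= delta_pages < (1 << 20):
        raise ValueError("page delta does not fit ADRP's signed 21-bit immediate")
    return delta_pages, target & (PAGE_SIZE - 1)


def resolve(pc, delta_pages, page_offset):
    """Recompute the address the instruction pair produces at run time."""
    return page(pc) + (delta_pages << 12) + page_offset


# Hypothetical instruction address; the target is the address seen in x0.
pc = 0x0000000100004B40
target = 0x00000001004C3618

delta, low12 = split_page_reference(pc, target)
print(hex(delta), hex(low12))          # 0x4bf 0x618
print(hex(resolve(pc, delta, low12)))  # 0x1004c3618
```

Whichever form is chosen, the instruction sequence has to match it: a direct reference ends in an `add`, while a GOT reference ends in an `ldr` from the slot.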