runtime: Regression: Bus error when running PublishSingleFile=true .NET 6.0 app on linux-arm (Raspbian)
Description
Hello,
In the original issue https://github.com/JustArchiNET/ArchiSteamFarm/issues/2457 I’m dealing with a regression that causes a single-file published app to crash during initialization with Bus error (to the best of my knowledge, the kernel sending SIGBUS to the process).
This issue did not happen with .NET 5.0 runtime, therefore I classify it as a regression.
<username>@<hostname>:~/ArchiSteamFarm $ ./ArchiSteamFarm
Bus error
Reproduction Steps
It’s very hard for me to give reproduction steps as I’m unable to reproduce this myself. The issue is specific to one user (albeit he claims that he has tried at least 2 different machines and got the same result).
The minimal repro I have right now is cloning my project with git clone https://github.com/JustArchiNET/ArchiSteamFarm.git and checking out commit 876c3324526d0fe6b0a801210b63f663a4eb816c. The minimal build command I managed to narrow it down to was:
dotnet publish ArchiSteamFarm -c Release -o out -r linux-arm /p:PublishSingleFile=true --self-contained
A precompiled build is also available for download: https://github.com/JustArchiNET/ArchiSteamFarm/releases/download/5.2.0.9/ASF-linux-arm.zip
Expected behavior
The app works as previously, initializes properly and executes code.
Actual behavior
The app crashes with Bus error (to the best of my knowledge, the kernel sending SIGBUS to the process). This happens before my app’s initialization takes place (before the first line is logged to the console), so it’s likely related to the in-memory decompression of the single-file app.
I’ve asked the user to run the app with COREHOST_TRACE=1; this was the output recorded before the crash:
Tracing enabled @ Thu Dec 2 10:52:21 2021 GMT
--- Invoked apphost [version: static, commit hash: static] main = {
./ArchiSteamFarm
}
The managed DLL bound to this executable is: 'ArchiSteamFarm.dll'
Detected Single-File app bundle
Using internal fxr
Invoking fx resolver [/home/pi/ArchiS2/] hostfxr_main_bundle_startupinfo
Host path: [/home/pi/ArchiS2/ArchiSteamFarm]
Dotnet path: [/home/pi/ArchiS2/]
App path: [/home/pi/ArchiS2/ArchiSteamFarm.dll]
Bundle Header Offset: [18f0a400]
--- Invoked hostfxr_main_bundle_startupinfo [commit hash: static]
Mapped application bundle
Unmapped application bundle
Single-File bundle details:
DepsJson Offset:[1d938] Size[61fa2b8]
RuntimeConfigJson Offset:[2b0] Size[75b0f0]
.net core 3 compatibility mode: [No]
--- Executing in a native executable mode...
Using dotnet root path [/home/pi/ArchiS2/]
App runtimeconfig.json from [/home/pi/ArchiS2/ArchiSteamFarm.dll]
Runtime config is cfg=/home/pi/ArchiS2/ArchiSteamFarm.runtimeconfig.json dev=/home/pi/ArchiS2/ArchiSteamFarm.runtimeconfig.dev.json
Attempting to read runtime config: /home/pi/ArchiS2/ArchiSteamFarm.runtimeconfig.json
Attempting to read dev runtime config: /home/pi/ArchiS2/ArchiSteamFarm.runtimeconfig.dev.json
Mapped bundle for [/home/pi/ArchiS2/ArchiSteamFarm.runtimeconfig.json]
Unmapped application bundle
Runtime config [/home/pi/ArchiS2/ArchiSteamFarm.runtimeconfig.json] is valid=[1]
Executing as a self-contained app as per config file [/home/pi/ArchiS2/ArchiSteamFarm.runtimeconfig.json]
Using internal hostpolicy
Reading from host interface version: [0x16041101:124] to initialize policy version: [0x16041101:124]
Mapped application bundle
Sadly not very informative to me.
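For anyone collecting the same trace, here is a minimal sketch of how it could be captured to a file together with the way the process ended. It assumes the host writes its trace to stderr when COREHOST_TRACE=1 is set; capture_host_trace is a hypothetical helper name, not part of .NET.

```shell
#!/bin/sh
# Sketch: run an app with host tracing enabled and keep the trace in a file.
# capture_host_trace is a hypothetical helper, not something shipped with .NET.
capture_host_trace() {
    app=$1   # path to the single-file executable, e.g. ./ArchiSteamFarm
    log=$2   # file that receives everything the process writes to stderr
    status=0
    COREHOST_TRACE=1 "$app" 2> "$log" || status=$?
    if [ "$status" -gt 128 ]; then
        # Shells report death by signal as 128 + signal number.
        echo "terminated by signal $(kill -l "$status")"
    else
        echo "exited with status $status"
    fi
}

# Usage (not executed here):
#   capture_host_trace ./ArchiSteamFarm trace.log
```

The last lines of the resulting log show how far the host got before the process was terminated, which is the same information as in the trace above.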
Regression?
Yes, single-file publish of this particular app worked fine in .NET 5.0. Single-file publish also works fine in .NET 6.0 for all other OS targets (e.g. linux-arm64, linux-x64, win-x64). It’s also not reproducible in all linux-arm setups: I didn’t receive such an error from other users, and we’ve tried to reproduce it ourselves without success.
According to the user this happens on 2 different machines (albeit similar ones), which decreases the chance of a hardware malfunction or similar.
Known Workarounds
I’d be very happy if you could suggest any. I’m trying various things that come to mind in the original issue at https://github.com/JustArchiNET/ArchiSteamFarm/issues/2457, and the only thing that actually made it work (at least for now) was PublishSingleFile=false.
Right now I’m testing with the user whether IncludeNativeLibrariesForSelfExtract=true or IncludeAllContentForSelfExtract=true helps with this issue.
Is there any way to force, through an environment variable, the old-style self-extraction of a single-file published app, i.e. one that doesn’t involve switches during compilation? If that worked, it’d be a decent enough workaround for me to suggest to users dealing with this issue in our linux-arm builds while this issue is investigated.
Configuration
Host machine: Raspberry Pi linux-arm (raspbian.10-arm, kernel 5.10.63-v8+)
Last working (tested) runtime: .NET 5.0.11. First not-working (tested) runtime: .NET 6.0.0
The issue is specific to that configuration, I could not reproduce this on my linux-arm64 Raspberry Pi 4 machine.
Other information
Please let me know what else I can provide/do to help narrow this one down. I had no luck reproducing it myself on any of my machines, but I strongly believe this is a regression in regards to .NET 5.0. Perhaps one of you will be able to reproduce this problem by running my app on Raspberry Pi (Raspbian) linux-arm OS and therefore gather more info required to fix the problem.
I’m trying to actively work with the user to provide more info in regards to this, you can find our conversation here: https://github.com/JustArchiNET/ArchiSteamFarm/issues/2457
Thank you in advance for your interest in regards to this issue.
About this issue
- State: closed
- Created 3 years ago
- Reactions: 1
- Comments: 22 (22 by maintainers)
@JustArchi the workaround for the issue until it is fixed is to execute the following on the affected devices:
This makes the kernel handle the unaligned accesses and makes apps work fine (only a tiny bit slower due to the trap to the kernel on each unaligned access).
Looking at the dump on my RPI4, it is really a misaligned access:
The ldrd instruction requires addresses aligned to 8 bytes.
Unaligned access handling can be set on Linux as described in https://mjmwired.net/kernel/Documentation/arm/mem_alignment. There are three options: the kernel handles it but prints a warning message, the kernel handles it silently, or the kernel generates SIGBUS.
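The options can be sketched as a small decoder for the mode value. This assumes the bit layout given in the linked kernel document for /proc/cpu/alignment (bit 0 = warn, bit 1 = fix up, bit 2 = send SIGBUS); describe_alignment_mode is a hypothetical helper name, and the file is only read where the kernel exposes it.

```shell
#!/bin/sh
# Sketch: decode an alignment-trap mode value into the behaviours it enables.
# Assumed bit layout (from the kernel's ARM mem_alignment document):
#   bit 0 = print a warning, bit 1 = fix up the access, bit 2 = send SIGBUS.
describe_alignment_mode() {
    mode=$1
    desc=""
    if [ $((mode & 1)) -ne 0 ]; then desc="$desc warn"; fi
    if [ $((mode & 2)) -ne 0 ]; then desc="$desc fixup"; fi
    if [ $((mode & 4)) -ne 0 ]; then desc="$desc signal"; fi
    if [ -z "$desc" ]; then desc=" ignored"; fi
    # Strip the leading space added while building the list.
    echo "${desc# }"
}

# Show the current setting only on kernels that provide the file.
if [ -r /proc/cpu/alignment ]; then
    cat /proc/cpu/alignment
fi
```

Under that reading, a mode with only bit 1 set corresponds to the silent fix-up option and a mode with bit 2 set to SIGBUS. The exact commands used in the comments here are not reproduced.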
I was able to easily repro the crash after issuing this command on my RPI4 (without any docker container)
Since this is specific to a particular environment, it could be hard to reproduce. I wonder what additional diagnostics we can get from the user.
Perhaps a core dump could be helpful, if available?
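If a core dump is wanted, one possible way to enable it for a single run is sketched below. It assumes a standard Linux setup where the shell's core size limit and /proc/sys/kernel/core_pattern control whether and where a dump is written. run_with_core_dumps and show_core_pattern are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: run a command with core dumps enabled for that run only.
run_with_core_dumps() {
    (
        # Raise the core size limit inside this subshell; this can fail
        # when the hard limit is lower, so only warn in that case.
        ulimit -c unlimited 2>/dev/null || echo "warning: could not raise core limit" >&2
        exec "$@"
    )
}

# Print where the kernel will write core files, if that setting is readable.
show_core_pattern() {
    if [ -r /proc/sys/kernel/core_pattern ]; then
        cat /proc/sys/kernel/core_pattern
    fi
}

# Usage (not executed here):
#   show_core_pattern
#   run_with_core_dumps ./ArchiSteamFarm
```

The subshell keeps the raised limit from leaking into the rest of the session, and the exit status of the wrapped command is passed through unchanged.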