bevy: STATUS_STACK_BUFFER_OVERRUN return code when running with dynamic linking using rust-lld on Windows 11

Bevy version

0.5 and main (07ed1d053e7946a116ce3eef273fc93dd246f49d)

Operating system & version

Win 11 (day1 release - 21H2)

What you did

Running cargo run on a minimal hello-world Bevy app with dynamic linking, using rust-lld on Win11.

use bevy::prelude::*;

fn main() {
    App::build()
        .add_system(hello_world.system())
        .run();
}

fn hello_world() {
    println!("hello world!");
}

config.toml

[target.x86_64-pc-windows-msvc]
linker = "rust-lld.exe"
# rustflags = ["-Zshare-generics=n", "-Ccontrol-flow-guard=nochecks"]
rustflags = ["-Zshare-generics=n"]

What you expected to happen

The app runs with dynamic linking

What actually happened

The process ends immediately with

PS C:\Projects\GameDev\bevy\dynamic_test> cargo run
   Compiling dynamic_test v0.1.0 (C:\Projects\GameDev\bevy\dynamic_test)
    Finished dev [optimized + debuginfo] target(s) in 1.32s
     Running `target\debug\dynamic_test.exe`
error: process didn't exit successfully: `target\debug\dynamic_test.exe` (exit code: 0xc0000409, STATUS_STACK_BUFFER_OVERRUN)

Additional information

Dropping either dynamic linking or rust-lld makes it work. The control-flow-guard compiler flag looks like it could be related, but adding -Ccontrol-flow-guard=nochecks to config.toml did not help.

Also possibly relevant: MS Docs on the MSVC flag /GS
MS Docs on Control Flow Guard

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 8
  • Comments: 32 (18 by maintainers)

Most upvoted comments

Hey, just wanted to let you know that it’s now working for me with Bevy 0.9.1, rust-lld and rustc 1.66.

If anyone comes across this later, I am running on the nightly toolchain and am able to build and run with the dynamic feature enabled by only dropping the following into my .cargo\config.toml:

[target.x86_64-pc-windows-msvc]
rustflags = ["-Zshare-generics=n"]

Dynamic by itself was never an issue for me, only coupled with the lld linker.

Opened an issue on gilrs: https://gitlab.com/gilrs-project/gilrs/-/issues/127. Assuming they are open to it and no one beats me to it, I should have time to work on it Friday.

Based on @RobertoMaurizzi’s finding that XInput1_3.dll can be successfully used in place of the 1.4 version, I’ve managed to work around this issue without having to edit any system files or otherwise influence any other program:

  • copy your system xinput 1.3 (from e.g. C:\Windows\system32\xinput1_3.dll)
  • go to your build directory (your_game/target/debug or release) that contains your built game exe file
  • paste the dll (1.3 version) there and RENAME it to Xinput1_4.dll

The drawback of this is you will need to repeat this process any time you run cargo clean or remove your target directory. If this is an issue to you, feel free to automate it in build.rs.
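The copy-and-rename steps above could be automated along these lines. This is only a minimal sketch of a build.rs, not a tested workaround: the helper name copy_renamed is made up for illustration, and the script assumes that Cargo's OUT_DIR has the usual target/<profile>/build/<package>-<hash>/out layout, which Cargo does not guarantee.

```rust
// build.rs (sketch): copy xinput1_3.dll next to the built executable,
// renamed to XInput1_4.dll.
use std::env;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: copies `src` into `dest_dir` under the name `new_name`
/// and returns the path of the copy.
fn copy_renamed(src: &Path, dest_dir: &Path, new_name: &str) -> io::Result<PathBuf> {
    fs::create_dir_all(dest_dir)?;
    let dest = dest_dir.join(new_name);
    fs::copy(src, &dest)?;
    Ok(dest)
}

fn main() {
    println!("cargo:rerun-if-changed=build.rs");

    // Only relevant when building for Windows.
    if env::var("CARGO_CFG_TARGET_OS").ok().as_deref() != Some("windows") {
        return;
    }
    // Cargo sets OUT_DIR for build scripts; do nothing if it is missing.
    let out_dir = match env::var("OUT_DIR") {
        Ok(dir) => PathBuf::from(dir),
        Err(_) => return,
    };
    // ASSUMPTION: OUT_DIR is target/<profile>/build/<package>-<hash>/out,
    // so the directory holding the executable is three levels up.
    let exe_dir = match out_dir.ancestors().nth(3) {
        Some(dir) => dir.to_path_buf(),
        None => return,
    };
    // Source path taken from the steps above; adjust if your system differs.
    let src = Path::new(r"C:\Windows\System32\xinput1_3.dll");
    if !src.exists() {
        println!("cargo:warning=xinput1_3.dll not found, skipping copy");
        return;
    }
    if let Err(err) = copy_renamed(src, &exe_dir, "XInput1_4.dll") {
        println!("cargo:warning=could not copy xinput1_3.dll: {err}");
    }
}
```

Because the copy happens in the build script, it is redone after a cargo clean the next time the package is built.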

This works because the Windows DLL loader searches for DLLs in the executable’s directory first by default. That way you can effectively replace any system DLL that a program tries to load.

Fun fact: this is often used by various game mod loaders to hook into the game at startup without having to patch any executable file. One way of doing it is to place a fake system DLL in your game directory (often xinput, thanks to its small API surface) that passes all real function calls to the real system DLL, but additionally injects the actual mod loader DLL into the game process by calling LoadLibrary inside DllMain.

If anyone comes across this later, I am running on the nightly toolchain and am able to build and run with the dynamic feature enabled by only dropping the following into my .cargo\config.toml:

[target.x86_64-pc-windows-msvc]
rustflags = ["-Zshare-generics=n"]

I just tried this on bevy/main since that just got merged and this is now working for me while using the rust-lld linker on Windows 🎉 (was not working for me before without patching gilrs)

Note that you do still have to make sure to add this to your Cargo.toml:

# Enable a small amount of optimization in debug mode
[profile.dev]
opt-level = 1

# Enable high optimizations for dependencies (incl. Bevy), but not for our code:
[profile.dev.package."*"]
opt-level = 3

Without that I was getting rust-lld: error: too many exported symbols (got 243525, max 65535) which is reported here: https://github.com/bevyengine/bevy/issues/1110#issuecomment-1312926923

Knowing more details about what the problem was, I tried to search for people with similar issues on Google and…

I first found a post on the official Visual Studio support forum from somebody that had the exact same problem, crash in XInput1_4.dll when compiling an old project using clang-cl under Windows 11. See “Clang-cl tool chain results in dll loading issues for Xinput.lib”.

The interesting part of that post is that the author said he was creating an “input library” for XInput that was working on previous versions of XInput/Windows/VSC++ clang, and that apparently defining a function called DllMain somewhere in that input library will avoid the crash (probably by overshadowing the DllMain function inside XInput1_4.dll?). Is any of the input crates doing that, or using an ‘external’ input library for XInput? Microsoft recognized the problem but blamed it on “not Visual Studio”.

Then I continued searching for people having trouble with XInput1_4.dll and I didn’t find much other than people recommending to download it from fishy websites and copy it in Windows\System32… HOWEVER, one of the posts, about Chivalry 2, was a bit different: it recommended renaming XInput1_3.dll to XInput1_4.dll and that, if you did have an XInput1_4.dll and the game still didn’t work, you could simply delete/rename the file and copy the 1_3 version with the same name as the 1_4 version. I tried and guess what?

Programs using the dynamic feature are now working! 🥳

The XInput1_4.dll file was NOT locked on my system, meaning that none of the programs normally running on my system was using it (so the whole bunch of crap from Asus to manage the strange hardware is ok). Steam and Epic clients are ok. After renaming the files, everything is still ok, including Unreal 5 and Blender and a few of the games I was able to try. Unfortunately they’re a bit old, so they might use older versions of XInput anyway. The only 2 modern games I have installed (Elite Dangerous and Genshin Impact) want me to download multiple gigabytes of updates, so I won’t be able to test them for probably a week or more 😛 In case you find programs or games that crash after you replace the file, it should be possible to simply restore the real XInput1_4.dll when you need to run them (and copy XInput1_3.dll over it again when you need to develop with Bevy using dynamic).

Now I’m finally able to compile a small change in < 4 seconds instead of 90+ and this will definitely help me in learning Bevy and experimenting with my game learning project 😄

It still persists with 0.7. Everything runs perfectly without the dynamic flag, and if you add it, the program immediately throws STATUS_STACK_BUFFER_OVERRUN.

That would be massively appreciated; better gamepad support on Windows + dynamic linking on Windows would both be incredible.

According to https://learn.microsoft.com/en-us/gaming/gdk/_content/gc/input/porting/input-porting-xinput#differences-between-xinput-and-xinputongameinput XInput is considered legacy. Maybe we just need to get upstream gilrs to use the newer “Windows gaming input” by default? That would enable broader gamepad support out of the box on Windows too. (Currently only Xbox controllers work out of the box with Bevy on Windows.)

Yes please, upstream for the compiler team is the place for this. Make sure to link this thread.

That’s a great minimal example, thanks a ton.

The winapi crate depends on XInput1_4.dll if the xinput feature is enabled. According to cargo tree --target x86_64-pc-windows-msvc -e features | grep -C5 xinput, only gilrs-core and its dependency rusty-xinput enable this feature.

https://github.com/retep998/winapi-rs/blob/2f76bdea3a79817ccfab496fbd1786d5a697387b/x86_64/def/xinput.def

    │               ├── gilrs-core feature "default"
    │               │   └── gilrs-core v0.4.1
    │               │       ├── uuid feature "default" (*)
    │               │       ├── log feature "default" (*)
    │               │       ├── winapi feature "default" (*)
    │               │       ├── winapi feature "xinput"
    │               │       │   └── winapi v0.3.9
    │               │       └── rusty-xinput feature "default"
    │               │           └── rusty-xinput v1.2.0
    │               │               ├── log v0.4.16 (*)
    │               │               ├── lazy_static feature "default" (*)
    │               │               ├── winapi feature "default" (*)
    │               │               ├── winapi feature "libloaderapi" (*)
    │               │               ├── winapi feature "winerror" (*)
    │               │               └── winapi feature "xinput" (*)
    │               └── vec_map feature "default"
    │                   └── vec_map v0.8.2
    ├── bevy_gltf feature "default"
    │   └── bevy_gltf v0.9.0-dev (/home/bjorn/Projects/bevy/crates/bevy_gltf)
    │       ├── anyhow feature "default" (*)

So I was chasing the same issue for a while, and I managed to get to the bottom of it. Please see: https://github.com/llvm/llvm-project/issues/82050

I was able to test disabling gilrs and it works for me too (best part is that gilrs at least in this W11 system is unable to detect a controller that works with both Steam Big Screen and the Web Controller APIs in Chrome). Excluding a default feature could be a bit less annoying to configure, but well, you do it only once.

The “funny Windows thing of today” is that both Steam and the Nvidia share assistant decided they now need to use XInput1_4.dll so I had to kill a few things to copy back the real DLL file in its place. Thanks SysInternals for Process Explorer.

What a great find!

I can confirm the problem is indeed with gilrs since it links to Xinput1_4.dll and for some unknown reason, when linking XInput1_4.dll using rust-lld the program crashes when calling DllMain of XInput1_4.dll.

[Image: XINPUT1_4.DLL is listed as a dependency of BEVY_DYLIB.DLL]

So one can also disable bevy/gilrs (if gamepads aren’t being used), and bevy_dylib.dll won’t load XInput1_4.dll anymore:

[Image: no more XINPUT1_4.DLL]

Sadly I wasn’t able to find which crate is requiring XInput1_4.dll to be loaded. Since rusty-xinput loads it dynamically, it must be somewhere else.

I tried to do what @aka-bash0r did and copied the required DLLs around: I copied the debug version of bevy_dylib-<number>.dll from my directory and the Rust std library DLL from ~/.rustup/toolchains/nightly-x86_64-pc-windows-msvc/lib/rustlib/x86_64-pc-windows-msvc/lib/std-8a6b5a658b168ce9.dll (nightly from 6dbae3ad1 2022-07-25). After copying the 2 DLLs to C:\Windows\System32, running the exe from the command line doesn’t crash, but it exits without doing anything.

Opening it in Visual Studio 2022 CE, it stops in some C/C++ glue (?) that, from what I’ve understood (… but I might be wrong), loads the instance of bevy_dylib.dll, as possibly described by this entry in the call stack window:

bevy_dylib-b4a908a5e3f80433.dll!dllmain_dispatch(HINSTANCE__ * const instance, const unsigned long reason, void * const reserved) Line 281
	at d:\a01\_work\43\s\src\vctools\crt\vcstartup\src\startup\dll_dllmain.cpp(281)

After that (point 1) there are 2 other calls (2 and 3) on the stack, without source since it’s MS code: it calls into XInput1_4.dll (Microsoft’s XInput gamepad library) for 2 stack frames, then crashes:

  1. call to bevy_dylib-b4a908a5e3f80433.dll!dllmain_dispatch
  2. call to XInput1_4.dll!DllMain()
  3. call to XInput1_4.dll!RegisterUtcEventProvider(void) that then crashes.

Checking the assembly code, since there are no sources available from Microsoft: the crash happens right after a je (jump if equal) instruction that targets another point in this RegisterUtcEventProvider function. When the jump is not taken, the code loads 5 into register ecx and then executes int 29h, which is our friend __fastfail; called with 5, it means FAST_FAIL_INVALID_ARG (from winnt.h, from what I was able to find online). The instructions leading up to the je are, if anybody can understand this better than me:

00007FF87E7F7A3C  push        rbx  
00007FF87E7F7A3E  sub         rsp,40h  
00007FF87E7F7A42  mov         rax,qword ptr [__security_cookie (07FF87E7FC068h)]  
00007FF87E7F7A49  xor         rax,rsp  
00007FF87E7F7A4C  mov         qword ptr [rsp+30h],rax  
00007FF87E7F7A51  xor         edx,edx  
00007FF87E7F7A53  lea         rcx,[g_VidPids (07FF87E7FC650h)]  
00007FF87E7F7A5A  lea         r8d,[rdx+40h]  
00007FF87E7F7A5E  call        memset (07FF87E7F82A9h)  
00007FF87E7F7A63  cmp         qword ptr [WPP_GLOBAL_Control+28h (07FF87E7FC038h)],0  
00007FF87E7F7A6B  mov         rax,qword ptr [WPP_GLOBAL_Control+10h (07FF87E7FC020h)]  
00007FF87E7F7A72  movups      xmm0,xmmword ptr [rax-10h]  
00007FF87E7F7A76  movdqu      xmmword ptr [rsp+20h],xmm0  
00007FF87E7F7A7C  je          RegisterUtcEventProvider+49h (07FF87E7F7A85h)  
00007FF87E7F7A7E  mov         ecx,5  
00007FF87E7F7A83  int         29h  

The debugger’s output window shows (as could be expected, given the int 29h call with 5): Unhandled exception at 0x00007FF87E7F7A83 (XInput1_4.dll) in fftf.exe: An invalid parameter was passed to a function that considers invalid parameters fatal.
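To make the int 29h argument easier to read, here is a small sketch that maps the first few __fastfail codes to their names. The values are the FAST_FAIL_* constants from winnt.h as far as I know them; the function name fast_fail_name is made up for illustration. Note that, per Microsoft's __fastfail documentation, a fast-fail request terminates the process with status 0xC0000409 whatever the argument is, which would be consistent with cargo reporting STATUS_STACK_BUFFER_OVERRUN even though no stack buffer was actually overrun.

```rust
/// Status reported for a process that terminated through __fastfail (int 29h).
const STATUS_STACK_BUFFER_OVERRUN: u32 = 0xC000_0409;

/// Hypothetical helper: name of a __fastfail code (the value placed in ecx).
/// Only the first few FAST_FAIL_* constants from winnt.h are listed.
fn fast_fail_name(code: u32) -> Option<&'static str> {
    match code {
        0 => Some("FAST_FAIL_LEGACY_GS_VIOLATION"),
        1 => Some("FAST_FAIL_VTABLE_ACCESS_VIOLATION"),
        2 => Some("FAST_FAIL_STACK_COOKIE_CHECK_FAILURE"),
        3 => Some("FAST_FAIL_CORRUPT_LIST_ENTRY"),
        4 => Some("FAST_FAIL_INCORRECT_STACK"),
        5 => Some("FAST_FAIL_INVALID_ARG"),
        6 => Some("FAST_FAIL_GS_COOKIE_INIT"),
        7 => Some("FAST_FAIL_FATAL_APP_EXIT"),
        _ => None,
    }
}

fn main() {
    // The disassembly loads 5 into ecx right before int 29h.
    let code = 5;
    println!(
        "exit status {:#x}, fast-fail code {} = {}",
        STATUS_STACK_BUFFER_OVERRUN,
        code,
        fast_fail_name(code).unwrap_or("unknown")
    );
}
```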

HOWEVER: this is at least a bit different than what happens when run with cargo, since without debugger I don’t get the exception printed out, so the reason might be different (some other configuration I’d need to perform manually).

QUESTION: where can I read how a cargo run sets up the execution environment for a Windows executable? I wasn’t able to find it with a quick search.

IDEAS: can we try to disable XInput somehow? Check if there are newer crates/bugs filed in… gilrs? gilrs-core? It might also be in other types of input however (touch?) or unrelated functions inside the DLL (from https://docs.rs/rusty-xinput/1.2.0/src/rusty_xinput/lib.rs.html we can see it has functions like XInputGetBatteryInformationFunc or XInputGetAudioDeviceIdsFunc …)

You will either need to add target/debug/deps and the output of rustc --print target-libdir to your PATH variable, or copy bevy_dylib-d25f65632eac054f.dll and std-0f7e1853181d29c2.dll from the aforementioned directories to target/debug next to the executable. Otherwise Windows doesn’t know where to find them. When you use cargo run, cargo adds them to the PATH for you, but this doesn’t happen if you run the executable without going through cargo.
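As a rough illustration of the first option, the sketch below builds a PATH value with extra DLL directories in front of the existing entries. The function name extended_path is made up, and the second directory is only a placeholder for whatever rustc --print target-libdir prints on your machine.

```rust
use std::env;
use std::ffi::OsString;
use std::path::PathBuf;

/// Hypothetical helper: returns a PATH value with `extra` directories placed
/// in front of the entries of `current` (if any).
fn extended_path(
    extra: &[PathBuf],
    current: Option<OsString>,
) -> Result<OsString, env::JoinPathsError> {
    let mut dirs: Vec<PathBuf> = extra.to_vec();
    if let Some(cur) = current {
        dirs.extend(env::split_paths(&cur));
    }
    env::join_paths(dirs)
}

fn main() {
    let extra = [
        PathBuf::from("target/debug/deps"),
        // Placeholder: replace with the output of `rustc --print target-libdir`.
        PathBuf::from("path/printed/by/rustc"),
    ];
    match extended_path(&extra, env::var_os("PATH")) {
        Ok(path) => println!("{}", path.to_string_lossy()),
        Err(err) => eprintln!("could not build PATH: {err}"),
    }
}
```

The resulting value could then be passed to the executable, for example with std::process::Command::new(...).env("PATH", path), instead of changing the PATH variable system-wide.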