vulkano: Invalid queue handle on AMD GPUs using Mesa 23.3.3 and Mesa 24
Template
If you don't understand something, just leave it. If you can provide more detailed information than the template allows for, please ignore the template and present all of your findings.
- Version of vulkano: 94f50f18bd25971ea123adb8b5782ad65a8f085c
- OS: NixOS using nixos-unstable
- GPU (the selected PhysicalDevice): AMD Radeon RX 6800 XT (RADV NAVI21)
- GPU Driver: Mesa 23.3.3
- Upload of a reasonably minimal complete main.rs file that demonstrates the issue: TODO
Issue
I and other users are experiencing segfaults when trying to run release builds of https://github.com/galister/wlx-overlay-s on AMD GPUs on Mesa 23.3.3 or Mesa 24.0.0.
Interestingly, it works fine with debug builds.
For Mesa 23.3.3 the segmentation fault occurs in vk_common_QueueSubmit (_queue=0x0, submitCount=1, pSubmits=0x7ffffafd5ce0, fence=0x0) at ../src/vulkan/runtime/vk_synchronization2.c:294 (mesa source) which shows that the queue handle is apparently 0x0.
After adding a simple debug message to wlx-overlay-s, we can confirm this difference in behavior between debug and release builds:
Debug:
[2024-02-10T23:10:14Z INFO wlx_overlay_s::graphics] build_and_execute_now queue: Some(Queue { handle: 0x5555577e4f60, device: 0x5555578356f0 (instance: 0x555557779f40), flags: empty(), queue_family_index: 0, id: 0, state: Mutex { data: QueueState } })
Release with debug info:
[2024-02-10T23:12:07Z INFO wlx_overlay_s::graphics] build_and_execute_now queue: Some(Queue { handle: 0x0, device: 0x55555656b4a0 (instance: 0x5555564af100), flags: empty(), queue_family_index: 0, id: 0, state: Mutex { data: QueueState } })
Cargo.toml:
...
[profile.release]
debug = true
Backtrace for cargo build --release using stripped Mesa 23.3.3:
#0 0x00007ffff5dca11a in vk_common_QueueSubmit () from /nix/store/j0wrj87k3q9r8r6nc7imcfg5phk9gyz9-mesa-23.3.3-drivers/lib/libvulkan_radeon.so
#1 0x0000555555e1b16e in vulkano::device::queue::QueueGuard::submit_unchecked (self=<optimized out>, submit_infos=..., fence=...) at src/device/queue.rs:1154
#2 0x0000555555e1dcaf in vulkano::device::queue::QueueGuard::submit (self=0x7fffffff3eb0, submit_infos=&[vulkano::command_buffer::SubmitInfo](size=1) = {...}, fence=...) at src/device/queue.rs:739
#3 vulkano::sync::future::queue_submit::{closure#0} (queue_guard=...) at src/sync/future/mod.rs:749
#4 vulkano::device::queue::Queue::with<core::result::Result<(), vulkano::Validated<vulkano::VulkanError>>, vulkano::sync::future::queue_submit::{closure_env#0}> (self=0x7fffffff4a60, func=...) at src/device/queue.rs:107
#5 vulkano::sync::future::queue_submit (queue=0x7fffffff4a60, submit_info=..., fence=..., future=...) at src/sync/future/mod.rs:749
#6 0x0000555555799f75 in vulkano::command_buffer::traits::{impl#2}::flush<vulkano::sync::future::now::NowFuture> (self=0x7fffffff4a50) at /home/scrumplex/.cargo/git/checkouts/vulkano-cb672043253a6e8d/94f50f1/vulkano/src/command_buffer/traits.rs:215
#7 0x00005555558ed5ad in wlx_overlay_s::graphics::WlxCommandBuffer::build_and_execute_now (self=...) at src/graphics.rs:939
#8 0x0000555555785364 in wlx_overlay_s::state::AppState::from_graphics (graphics=Arc(strong=1, weak=0) = {...}) at src/state.rs:35
#9 0x000055555577c957 in wlx_overlay_s::backend::openxr::openxr_run (running=...) at src/backend/openxr/mod.rs:59
#10 0x0000555555890b3c in wlx_overlay_s::auto_run (running=Arc(strong=3, weak=0) = {...}) at src/main.rs:61
#11 wlx_overlay_s::main () at src/main.rs:42
About this issue
- State: closed
- Created 5 months ago
- Comments: 16 (10 by maintainers)
Waiting for @Scrumplex to confirm. For me it never segfaults, so testing has been a bit of a pain.
I see that the safety contract of Instance::from_handle is violated for the same reason: the create infos must match.

I also see that you load the Vulkan library using both ash (Entry::load) and vulkano (VulkanLibrary::new()). That's going to result in 2 libraries being loaded, having different function pointers. You must instead only load the library on one side and pass the vkGetInstanceProcAddr function pointer when creating it on the other.

I see that vulkano's DeviceCreateInfo is missing the queues field, unlike your ash create info. That's against the safety contract of Device::from_handle, so that would be the first place I would look. Though I don't know why it would lead to such a strange outcome.
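To illustrate the "create infos must match" rule from the comments above, here is a minimal std-only sketch of a check an application could run before wrapping a natively created device. It is not vulkano's or ash's API: QueueRequest and check_queue_requests_match are hypothetical names invented for this example, and the real create info structs carry many more fields than the two modeled here.

```rust
// Hypothetical, std-only model of the "create infos must match" rule.
// None of these types come from vulkano or ash.

/// One queue request: a queue family index and how many queues to take from it.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct QueueRequest {
    pub family_index: u32,
    pub queue_count: u32,
}

/// Compares the queues the device was actually created with (native side)
/// against the queues described to the wrapping library (wrapper side).
/// The comparison ignores ordering. Returns an error message on mismatch.
pub fn check_queue_requests_match(
    native_side: &[QueueRequest],
    wrapper_side: &[QueueRequest],
) -> Result<(), String> {
    let mut native = native_side.to_vec();
    let mut wrapper = wrapper_side.to_vec();
    native.sort();
    wrapper.sort();

    if native == wrapper {
        Ok(())
    } else {
        Err(format!(
            "queue requests differ: device was created with {:?}, wrapper was told {:?}",
            native, wrapper
        ))
    }
}
```

In this model, the situation described above (queues listed on the ash side but none on the vulkano side) corresponds to calling the check with a non-empty first slice and an empty second slice, which returns an error. A check like this only catches the mismatch early; it does not explain why the mismatch would produce a null queue handle.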