vk_video_samples: Numerous segfaults/asserts on mesa 23.2 with ANV_VIDEO_DECODE

Intel hardware has some initial support for Vulkan video decoding, but this application fails in numerous ways when run on Linux with Mesa. Most of these appear to be bugs in the sample and not in the driver.

  1. Configure the program for ANV’s queue families and queue count
  2. Attempt to run the sample with a 10s clip of Big Buck Bunny
  3. Crash.

While trying to get the sample working, the fixes/workarounds I needed were:

  1. Only call MapMemory when using generateColorPatternRgba888 in vk_video_decoder/libs/VkCodecUtils/VulkanVideoUtils.cpp ImageObject::FillImageWithPattern, because vkFillYuv.fillVkImage also attempts to map the memory. Mapping twice is invalid by spec and fails on Mesa.
  2. Fix the dependency on undefined behavior in Shell::AcquireBackBuffer, where assert(acquireBuf != nullptr); attempts to check whether the backbuffer queue was empty. With that fixed you get more reasonable crashes.
  3. Disable non-FIFO present modes in vk_video_decoder/libs/VkShell/Shell.cpp Shell::ResizeSwapchain. The code appears to depend on AcquireNextImage blocking, so if other modes are present it will overrun the backbuffer queue and hang or crash. (There is a vsync option in the config, but this doesn’t appear to affect decode presentation.)
  4. Reduce the DPB size check in vk_video_decoder/libs/NvVideoParser/include/VulkanH264Decoder.h dpb_full(). Since the newly allocated reference frame has state == 0, it is not counted here despite being counted in the DPB’s accounting. The next decode step will then assert that the DPB is out of slots and crash when it attempts to allocate another reference frame.
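
To illustrate item 2, here is a minimal sketch of acquiring a back buffer without touching an empty queue. BackBuffer, BackBufferQueue, and TryAcquire are hypothetical stand-ins, not the sample's actual types, and I'm assuming the undefined behavior comes from reading the front of an empty std::queue:

```cpp
#include <cstddef>
#include <queue>

// Hypothetical stand-in for the sample's per-frame back-buffer record.
struct BackBuffer {
    int imageIndex = -1;
};

// Hypothetical wrapper around the queue of free back buffers.
class BackBufferQueue {
public:
    // Return a back buffer to the free list once presentation is done.
    void Release(BackBuffer* buf) { free_.push(buf); }

    // Returns nullptr when no back buffer is available, instead of calling
    // front() on an empty std::queue, which is undefined behavior.
    BackBuffer* TryAcquire() {
        if (free_.empty()) {
            return nullptr;
        }
        BackBuffer* buf = free_.front();
        free_.pop();
        return buf;
    }

    std::size_t Available() const { return free_.size(); }

private:
    std::queue<BackBuffer*> free_;
};
```

With an explicit empty() check, the caller can decide whether to wait, skip the frame, or fail cleanly, rather than relying on an assert that runs after the invalid access has already happened.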

That should cover everything I needed to get the sample decoder running successfully. It should be noted that incorrect handling of YCbCr sampler attachment, synchronization of cmdbufs, image ownership transitions (due to separate video and present queues), and more results in a large amount of validator noise, but the frames presented appear more or less correct.

Thanks for writing up the sample as well; it was convenient to have something to test the new functionality with.

About this issue

  • State: open
  • Created a year ago
  • Comments: 20 (13 by maintainers)

Most upvoted comments

@zlatinski I think the image count calcs are wrong.

caps.maxImageCount can legally be 0, so you shouldn’t limit things to max if it’s 0.
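
A rough sketch of what I mean, using a plain struct as a stand-in for the two VkSurfaceCapabilitiesKHR fields involved so it compiles without the Vulkan headers. SurfaceImageLimits and ChooseSwapchainImageCount are made-up names, not code from the sample:

```cpp
#include <algorithm>
#include <cstdint>

// Stand-in for the minImageCount / maxImageCount fields of
// VkSurfaceCapabilitiesKHR (hypothetical type, for illustration only).
struct SurfaceImageLimits {
    std::uint32_t minImageCount;
    std::uint32_t maxImageCount;  // 0 means there is no upper limit
};

// Pick a swapchain image count: honor the minimum, and apply the maximum
// only when the surface actually reports one.
std::uint32_t ChooseSwapchainImageCount(const SurfaceImageLimits& caps,
                                        std::uint32_t desired) {
    std::uint32_t count = std::max(desired, caps.minImageCount);
    if (caps.maxImageCount != 0) {
        count = std::min(count, caps.maxImageCount);
    }
    return count;
}
```

Clamping unconditionally with min(count, caps.maxImageCount) would produce 0 images on any surface that reports no limit, which is the failure described above.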