renderdoc: VK_ERROR_DEVICE_LOST in ApplyInitialContents and on opening Texture or Mesh Viewer

Description

After capturing a frame from the 3d_scene example of Bevy (specific commit), renderdoc logs VK_ERROR_DEVICE_LOST errors in its diagnostic log and fails to show the texture view. It then hangs with “Please wait, working...” if I try to inspect a different event from the Event Browser, or do certain other things.

This screenshot shows where renderdoc actually hangs after that.

image

This is renderdoc’s diagnostic log from before I cause it to hang.

VK_ERROR_DEVICE_LOST.txt

To get a better idea of what’s happening, I compiled the latest renderdoc from git (commit cd5d0ede440aff1fcb44121a21c088b77ec64285) and tried again with that build, with the same results. Visual Studio (2019, because the right Windows SDK for 2015 isn’t on Microsoft’s site anymore) gives me this stack when the first exception occurs (on running the application directly):

>	renderdoc.dll!WrappedVulkan::FlushQ() Line 351	C++
 	renderdoc.dll!WrappedVulkan::ApplyInitialContents() Line 2785	C++
 	renderdoc.dll!WrappedVulkan::ReplayLog(unsigned int startEventID, unsigned int endEventID, ReplayLogType replayType) Line 3434	C++
 	renderdoc.dll!VulkanReplay::ReplayLog(unsigned int endEventID, ReplayLogType replayType) Line 204	C++
 	renderdoc.dll!ReplayController::SetFrameEvent(unsigned int eventId, bool force) Line 81	C++
 	qrenderdoc.exe!CaptureContext::SetEventID::__l2::<lambda>(IReplayController * r) Line 1507	C++
 	[External Code]	
 	qrenderdoc.exe!ReplayManager::run(int proxyRenderer, const QString & capturefile, const ReplayOptions & opts, std::function<void __cdecl(float)> progress) Line 496	C++
 	qrenderdoc.exe!ReplayManager::OpenCapture::__l2::<lambda>() Line 56	C++
 	[External Code]	
 	qrenderdoc.exe!LambdaThread::process() Line 345	C++
 	qrenderdoc.exe!QtPrivate::FunctorCall<QtPrivate::IndexesList<>,QtPrivate::List<>,void,void (__cdecl LambdaThread::*)(void) __ptr64>::call(void(LambdaThread::*)() f, LambdaThread * o, void * * arg) Line 136	C++
 	qrenderdoc.exe!QtPrivate::FunctionPointer<void (__cdecl LambdaThread::*)(void) __ptr64>::call<QtPrivate::List<>,void>(void(LambdaThread::*)() f, LambdaThread * o, void * * arg) Line 170	C++
 	qrenderdoc.exe!QtPrivate::QSlotObject<void (__cdecl LambdaThread::*)(void) __ptr64,QtPrivate::List<>,void>::impl(int which, QtPrivate::QSlotObjectBase * this_, QObject * r, void * * a, bool * ret) Line 121	C++
 	[External Code]	

and this on loading a saved capture:

>	renderdoc.dll!WrappedVulkan::SubmitCmds(VkSemaphore_T * * unwrappedWaitSemaphores, unsigned int * waitStageMask, unsigned int waitSemaphoreCount) Line 290	C++
 	renderdoc.dll!WrappedVulkan::AddFrameTerminator(unsigned __int64 queueMarkerTag) Line 3409	C++
 	renderdoc.dll!WrappedVulkan::ContextReplayLog(CaptureState readType, unsigned int startEventID, unsigned int endEventID, bool partial) Line 2676	C++
 	renderdoc.dll!WrappedVulkan::ReadLogInitialisation(RDCFile * rdc, bool storeStructuredBuffers) Line 2438	C++
 	renderdoc.dll!VulkanReplay::ReadLogInitialisation(RDCFile * rdc, bool storeStructuredBuffers) Line 199	C++
 	renderdoc.dll!ReplayController::PostCreateInit(IReplayDriver * device, RDCFile * rdc) Line 2042	C++
 	renderdoc.dll!ReplayController::CreateDevice(RDCFile * rdc, const ReplayOptions & opts) Line 2009	C++
 	renderdoc.dll!CaptureFile::OpenCapture(const ReplayOptions & opts, std::function<void __cdecl(float)> progress) Line 364	C++
 	qrenderdoc.exe!ReplayManager::run(int proxyRenderer, const QString & capturefile, const ReplayOptions & opts, std::function<void __cdecl(float)> progress) Line 450	C++
 	qrenderdoc.exe!ReplayManager::OpenCapture::__l2::<lambda>() Line 56	C++
 	[External Code]	
 	qrenderdoc.exe!LambdaThread::process() Line 345	C++
 	qrenderdoc.exe!QtPrivate::FunctorCall<QtPrivate::IndexesList<>,QtPrivate::List<>,void,void (__cdecl LambdaThread::*)(void) __ptr64>::call(void(LambdaThread::*)() f, LambdaThread * o, void * * arg) Line 136	C++
 	qrenderdoc.exe!QtPrivate::FunctionPointer<void (__cdecl LambdaThread::*)(void) __ptr64>::call<QtPrivate::List<>,void>(void(LambdaThread::*)() f, LambdaThread * o, void * * arg) Line 170	C++
 	qrenderdoc.exe!QtPrivate::QSlotObject<void (__cdecl LambdaThread::*)(void) __ptr64,QtPrivate::List<>,void>::impl(int which, QtPrivate::QSlotObjectBase * this_, QObject * r, void * * a, bool * ret) Line 121	C++
 	[External Code]	

followed by

>	renderdoc.dll!WrappedVulkan::SubmitCmds(VkSemaphore_T * * unwrappedWaitSemaphores, unsigned int * waitStageMask, unsigned int waitSemaphoreCount) Line 290	C++
 	renderdoc.dll!WrappedVulkan::ApplyInitialContents() Line 2782	C++
 	renderdoc.dll!WrappedVulkan::ReplayLog(unsigned int startEventID, unsigned int endEventID, ReplayLogType replayType) Line 3434	C++
 	renderdoc.dll!VulkanReplay::ReplayLog(unsigned int endEventID, ReplayLogType replayType) Line 204	C++
 	renderdoc.dll!ReplayController::SetFrameEvent(unsigned int eventId, bool force) Line 81	C++
 	qrenderdoc.exe!CaptureContext::SetEventID::__l2::<lambda>(IReplayController * r) Line 1507	C++
 	[External Code]	
 	qrenderdoc.exe!ReplayManager::run(int proxyRenderer, const QString & capturefile, const ReplayOptions & opts, std::function<void __cdecl(float)> progress) Line 496	C++
 	qrenderdoc.exe!ReplayManager::OpenCapture::__l2::<lambda>() Line 56	C++
 	[External Code]	
 	qrenderdoc.exe!LambdaThread::process() Line 345	C++
 	qrenderdoc.exe!QtPrivate::FunctorCall<QtPrivate::IndexesList<>,QtPrivate::List<>,void,void (__cdecl LambdaThread::*)(void) __ptr64>::call(void(LambdaThread::*)() f, LambdaThread * o, void * * arg) Line 136	C++
 	qrenderdoc.exe!QtPrivate::FunctionPointer<void (__cdecl LambdaThread::*)(void) __ptr64>::call<QtPrivate::List<>,void>(void(LambdaThread::*)() f, LambdaThread * o, void * * arg) Line 170	C++
 	qrenderdoc.exe!QtPrivate::QSlotObject<void (__cdecl LambdaThread::*)(void) __ptr64,QtPrivate::List<>,void>::impl(int which, QtPrivate::QSlotObjectBase * this_, QObject * r, void * * a, bool * ret) Line 121	C++
 	[External Code]	

followed by more (a bit too many to paste). After a number of exceptions, renderdoc runs without exceptions again until I do something that triggers the actual hang, such as clicking a drawcall in the Event Browser. At that point renderdoc hangs without raising an exception.

I suspect there is a problem with the application I’m capturing from, but I was not expecting renderdoc to hang.

Steps to reproduce

Here is a capture from the application that reproduces the error every time for me (VK_ERROR_DEVICE_LOST.rdc).

The other file (no_VK_ERROR_DEVICE_LOST.rdc) is a capture from the last commit on bevy main (commit b6be8a5314e027a0b0f3ee48d04c14b52fe74676) that doesn’t exhibit this problem. I would put the offending commit (by my hand) at 45b2db70705da24a89426e6b6e77d603a3983025, in case that helps.

This is the renderdoc diagnostic log for this second capture.

no_VK_ERROR_DEVICE_LOST.txt

Alternatively, with Rust installed, one could clone https://github.com/bevyengine/bevy, check out f520a341d5737600dbf89015b7729109d67cf041 (the HEAD of main at the time of writing) and build the application with cargo build --example 3d_scene. It can then be launched from target\debug\examples\3d_scene.exe, with the root of the project as the working directory and the environment variable CARGO_MANIFEST_DIR=<path to working dir> set.

I can also send over a compiled executable if desired.

Environment

  • RenderDoc version: v1.13 and cd5d0ede440aff1fcb44121a21c088b77ec64285 (the latter built using Visual Studio 2019)
  • Operating System: Windows 10 Education 19042.867
  • Graphics API: Vulkan

gpu-z

Edit: The issue seems similar to #2216. I’ve tried to include as much detail as I can, but I will provide what other details I can if asked.

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 18 (8 by maintainers)

Most upvoted comments

Thanks, that showed the issue immediately. It was related to the “Unexpected descriptor type” errors from before. Vulkan has a little feature that allows a descriptor write to overrun from one binding into the next, as long as the type is compatible (in this case all buffers). I had implemented that, but only assuming tightly packed bindings, i.e. binding 0 rolling over into binding 1. In your case the bindings were sparse, so instead of binding 0 rolling over into binding 3 it rolled over into where binding 1 should be and threw that error, and then also did not properly record the results of the update. I guess this feature is rarely used, so no-one had run into this before.

That commit should fix the rollover behaviour to properly advance to the next binding.
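For readers unfamiliar with the rollover rule, here is a rough sketch of how a write that overruns one binding could be expanded across a sparse layout. It is not RenderDoc's actual code: the `Layout` alias and the `ExpandWrite` function are hypothetical names made up for illustration, and the layout is modelled simply as a map from binding number to array size.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical model of a descriptor set layout: binding number -> array size.
// Bindings that are absent from the map do not exist in the layout.
using Layout = std::map<uint32_t, uint32_t>;

// Expands a write of `count` descriptors starting at (binding, element) into
// the list of (binding, arrayElement) slots it touches. When a binding is
// exhausted, the write continues at element 0 of the next binding that exists
// in the layout, rather than blindly at binding + 1.
std::vector<std::pair<uint32_t, uint32_t>> ExpandWrite(const Layout &layout, uint32_t binding,
                                                       uint32_t element, uint32_t count)
{
  std::vector<std::pair<uint32_t, uint32_t>> slots;
  auto it = layout.find(binding);

  while(count > 0 && it != layout.end())
  {
    if(element >= it->second)
    {
      // Ran off the end of this binding (or it has zero elements):
      // advance to the next binding actually present in the layout.
      ++it;
      element = 0;
      continue;
    }

    slots.emplace_back(it->first, element);
    ++element;
    --count;
  }

  return slots;
}
```

With a layout that only has bindings 0 (one element) and 3 (two elements), a three-descriptor write starting at binding 0 would land on binding 0 element 0, then binding 3 elements 0 and 1. An implementation that assumed tightly packed bindings would instead try to continue at binding 1, which is the kind of mismatch described above.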

I just built the latest commit and can confirm the fix works, also for other cases using Bevy.