VulkanMemoryAllocator: vmaCreateBuffer is prohibitively slow when bufferImageGranularity > 1
On devices that have bufferImageGranularity set to 1, we observe that vmaCreateBuffer can be two orders of magnitude faster than on devices that have bufferImageGranularity set to a higher value. In one extreme example, we have an asset that loads in less than 1 second on a device with bufferImageGranularity=1 and takes 1.5 minutes to load on a device with bufferImageGranularity=4096.
Capturing a trace of the CPU work reveals that the CheckAllocation function is called repeatedly in the slow case.
Can this slow path be avoided by tweaking some of the configuration, or using a custom pool?
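For readers unfamiliar with why granularity matters here: bufferImageGranularity is the page size within which a linear resource (such as a buffer) and a non-linear one (such as an optimally tiled image) must not sit next to each other in the same memory block. The sketch below shows the kind of "same page" test an allocator has to run against neighbouring allocations. It is a standalone illustration, not the library's code; the function name `onSamePage` is made up for this example, and it assumes the granularity is a power of two and that resource A lies entirely before resource B.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of any library API).
// Returns true when the last byte of resource A and the first byte of
// resource B fall into the same granularity-sized page.
// Assumptions: A ends at or before B starts; pageSize is a power of two.
inline bool onSamePage(std::uint64_t aOffset, std::uint64_t aSize,
                       std::uint64_t bOffset, std::uint64_t pageSize) {
    assert(aSize > 0 && pageSize > 0 && (pageSize & (pageSize - 1)) == 0);
    assert(aOffset + aSize <= bOffset);
    // Round the last byte of A and the first byte of B down to a page start.
    const std::uint64_t aEndPage = (aOffset + aSize - 1) & ~(pageSize - 1);
    const std::uint64_t bStartPage = bOffset & ~(pageSize - 1);
    return aEndPage == bStartPage;
}
```

With a granularity of 1, two non-overlapping resources can never share a page, so a check like this is always false and the neighbour scan has nothing to do. With a granularity of 4096, small adjacent allocations frequently share a page, so every candidate placement has to be checked against its neighbours. That is presumably where the repeated calls in the trace come from.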
About this issue
- State: closed
- Created 3 years ago
- Comments: 28 (19 by maintainers)
I’ve tested this in the Godot engine and there’s a clear improvement: https://github.com/godotengine/godot/pull/51524
Yes, it might be a good approach, as long as you don’t use the linear algorithm for those pools. As I said before, that algorithm is not good for random allocations: the pool may grow indefinitely in size.
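To illustrate the warning about the linear algorithm, here is a toy model of a stack-like allocator that only reclaims space when the topmost allocation is freed. It is a deliberately simplified sketch and not the library's implementation: the names `LinearModel` and `simulateRotation` are invented for this example, the address space is unbounded, and reclaiming from the front (ring-buffer style) is left out because the simulated workload keeps a permanent allocation at offset 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a linear allocator (hypothetical, for illustration only).
struct LinearModel {
    struct Entry { std::uint64_t offset; std::uint64_t size; bool live; };
    std::vector<Entry> entries;      // kept in address order
    std::uint64_t top = 0;           // next free offset
    std::uint64_t highWater = 0;     // largest value `top` ever reached
    std::uint64_t liveBytes = 0;     // bytes currently in use

    // Always places the new allocation at the current top.
    std::size_t alloc(std::uint64_t size) {
        entries.push_back({top, size, true});
        top += size;
        liveBytes += size;
        if (top > highWater) highWater = top;
        return entries.size() - 1;   // handle = index into `entries`
    }

    // Space is reclaimed only when the freed entries are at the top.
    void free(std::size_t handle) {
        assert(handle < entries.size() && entries[handle].live);
        entries[handle].live = false;
        liveBytes -= entries[handle].size;
        while (!entries.empty() && !entries.back().live) {
            top = entries.back().offset;
            entries.pop_back();
        }
    }
};

// Non-LIFO workload: one permanent allocation, plus a "keeper" that is
// replaced every round (the new one is allocated before the old one is freed).
inline LinearModel simulateRotation(int rounds, std::uint64_t size) {
    LinearModel m;
    m.alloc(size);                       // permanent, never freed
    std::size_t keeper = m.alloc(size);
    for (int i = 0; i < rounds; ++i) {
        const std::size_t next = m.alloc(size);
        m.free(keeper);                  // hole in the middle, not reclaimed
        keeper = next;
    }
    return m;
}
```

In this model, 1000 rounds of 64-byte allocations leave only 128 bytes live while the high-water mark reaches 64128 bytes. Strictly stack-ordered frees, by contrast, bring `top` back to zero. That is the sense in which such a pool suits stack or ring-buffer usage but not random allocation and deallocation.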
I’ve done some additional research and taken a new approach to the same optimization in https://github.com/godotengine/godot/pull/57989.
TL;DR: There I’ve used the latest vk_mem_alloc.h with a patch so that non-default pools can have mixed memory types, just like the default one. This lets user code use a separate pool for small objects without having to care about memory types: there is no need to create as many pools as there are memory flag combinations, which are also hard to collect in user code. As I also say there, if the maintainers here find such a thing useful, I can make a flexible version in which a pool being “universal” is opt-in via a flag, keeping the current memory-type-specific pools as the default behavior.
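For context, this is roughly the bookkeeping that user code needs when pools are tied to a single memory type and small objects should go to their own pool. The sketch is standalone and hypothetical: `PoolHandle` stands in for a real pool handle, `SmallObjectPools` and its methods are invented names, and the size limit passed to the constructor is an arbitrary assumption rather than a value taken from either pull request.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for a real pool handle.
struct PoolHandle { int id; };

// Routes small allocations to a lazily created pool per memory type.
class SmallObjectPools {
public:
    explicit SmallObjectPools(std::uint64_t smallLimit) : smallLimit_(smallLimit) {}

    // Returns the pool to use, or nullptr meaning "use the default pool".
    const PoolHandle* select(std::uint64_t size, std::uint32_t memoryTypeIndex) {
        if (size > smallLimit_) return nullptr;
        auto it = pools_.find(memoryTypeIndex);
        if (it == pools_.end()) {
            // Real code would create the pool for this memory type here.
            it = pools_.emplace(memoryTypeIndex, PoolHandle{nextId_++}).first;
        }
        return &it->second;  // element addresses are stable in unordered_map
    }

    std::size_t poolCount() const { return pools_.size(); }

private:
    std::uint64_t smallLimit_;
    int nextId_ = 0;
    std::unordered_map<std::uint32_t, PoolHandle> pools_;
};
```

Each distinct memory type that ever receives a small allocation ends up with its own pool. A "universal" pool as proposed above would collapse this map into a single pool handle, which is the simplification being argued for.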