webrender: PBO copies are slow on Angle
It appears that Windows/Angle spends tons of time in update_texture_from_pbo: https://perfht.ml/2zJnkg4
Angle decided to defer mapping and filling an actual D3D11 staging buffer until this call. Since the copy itself is done in a draw call, it has to stall the GPU until the copy is done. This is one issue, but the actual current one is different: Map itself is waiting. I suppose it’s waiting for the GPU because the texture is still in use, which implies our PBO orphaning doesn’t work as expected (see Device::orphan_pbo).
About this issue
- State: closed
- Created 7 years ago
- Comments: 17 (11 by maintainers)
Commits related to this issue
- Auto merge of #2147 - kvark:pbo-style, r=glennw Texture update strategies ~~Fixes #2110, hopefully: still to be benchmarked on Windows.~~ Provide the settings for Gecko to fix ^^ Try push: https:/... — committed to servo/webrender by deleted user 7 years ago
- Auto merge of #2162 - kvark:gpu-upload, r=glennw Scattered GPU cache updates Fixes #2110 This PR introduces the GPU cache uploading via shader writes. It minimizes the amount of data we need to tra... — committed to servo/webrender by deleted user 7 years ago
See also https://bugzilla.mozilla.org/show_bug.cgi?id=1421783 and https://bugzilla.mozilla.org/show_bug.cgi?id=1421784
Yes, that’s my plan 😃 We’ll definitely need separate paths on Windows/Angle versus the world 😃
When I originally set up the GPU cache, the idea was to be able to use an unsynchronized map to update the texture, on platforms where that made sense (i.e. a persistent pointer into the texture data). It does two things to enable this:
In theory, this means we don’t need any synchronization here - these two policies should guarantee that we never write to a location that is < 10 frames old, which should thus guarantee that the GPU is never reading incorrect data.
Admittedly, I’ve never actually tested this in practice - I intended to revisit it at a later time. But perhaps we can signal to ANGLE that it doesn’t need to do any blocking and see how it goes?
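To make the "never write to a location that is < 10 frames old" rule concrete, here is a minimal sketch of the check it implies. It is not webrender's actual code: `CacheBlock`, `last_access_frame`, `is_safe_to_write` and `MIN_FRAMES_BEFORE_REUSE` are hypothetical names, and the threshold of 10 is only taken from the wording above.

```rust
/// Assumed threshold, taken from the "< 10 frames old" wording above.
const MIN_FRAMES_BEFORE_REUSE: u64 = 10;

/// Hypothetical bookkeeping for one block of the GPU cache.
struct CacheBlock {
    /// Frame on which the GPU may last have read this block; `None` if never used.
    last_access_frame: Option<u64>,
}

impl CacheBlock {
    /// Returns true if an unsynchronized write is allowed under the policy,
    /// i.e. the block has not been touched for at least the threshold.
    fn is_safe_to_write(&self, current_frame: u64) -> bool {
        match self.last_access_frame {
            None => true,
            Some(last) => current_frame.saturating_sub(last) >= MIN_FRAMES_BEFORE_REUSE,
        }
    }
}

fn main() {
    let current_frame = 100;
    let fresh = CacheBlock { last_access_frame: None };
    let recent = CacheBlock { last_access_frame: Some(95) };
    let stale = CacheBlock { last_access_frame: Some(90) };

    println!("fresh:  {}", fresh.is_safe_to_write(current_frame));
    println!("recent: {}", recent.is_safe_to_write(current_frame));
    println!("stale:  {}", stale.is_safe_to_write(current_frame));
}
```

If a check like this holds for every write, the CPU side never touches data the GPU could still be reading, which is the condition under which an unsynchronized map would be safe. Whether ANGLE can be told to skip its own blocking is a separate question that this sketch does not address.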
Another possibility (more of a temporary quick fix / hack) could be to round-robin a series of backing textures for the GPU cache, and update / upload the entire texture each frame. This sounds bad, but that data is actually quite small, and may be perfectly fine as an interim solution, if most of the time is spent blocking on a GPU fence or similar.
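A rough sketch of the round-robin idea, with plain byte buffers standing in for the GPU textures so that it runs without a GL context. `BackingTexture`, `TextureRing` and `upload_frame` are hypothetical names, and the ring size of 3 is an arbitrary assumption, not a value from this thread.

```rust
/// Hypothetical stand-in for a GPU texture: just a CPU byte buffer here.
#[derive(Clone)]
struct BackingTexture {
    data: Vec<u8>,
}

/// A ring of backing textures; each frame uploads into the next one,
/// so the texture the GPU is still reading from is left alone.
struct TextureRing {
    textures: Vec<BackingTexture>,
    frame: u64,
}

impl TextureRing {
    fn new(count: usize, size_in_bytes: usize) -> Self {
        assert!(count > 0, "the ring needs at least one texture");
        TextureRing {
            textures: vec![BackingTexture { data: vec![0; size_in_bytes] }; count],
            frame: 0,
        }
    }

    /// Copies the whole CPU-side cache into the next texture in the ring
    /// and returns the index of the texture to bind for this frame.
    /// Panics if `cpu_cache` does not match the texture size.
    fn upload_frame(&mut self, cpu_cache: &[u8]) -> usize {
        let index = (self.frame % self.textures.len() as u64) as usize;
        self.textures[index].data.copy_from_slice(cpu_cache);
        self.frame += 1;
        index
    }
}

fn main() {
    let mut ring = TextureRing::new(3, 4);
    for frame in 0..5u8 {
        let cpu_cache = [frame; 4];
        let index = ring.upload_frame(&cpu_cache);
        println!("frame {} uploaded to texture {}", frame, index);
    }
}
```

In a real renderer the `copy_from_slice` would be a full texture upload, and the ring would need to be at least as deep as the number of frames the GPU can have in flight.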
@kvark @jrmuizel These might be worth pursuing if we’re not able to get the current path running well on ANGLE.
The profile shows us hitting the fast path, not missing it.
On Nov 27, 2017 8:29 PM, “Glenn Watson” notifications@github.com wrote:
Thanks to Angle team I got some answers! This is not about orphaning, but rather about some texture formats not being supported by the fast path of buffer->texture copies: https://cs.chromium.org/chromium/src/third_party/angle/src/libANGLE/renderer/d3d/d3d11/Renderer11.cpp?type=cs&q=supportsFastCopyBufferToTexture&sq=package:chromium&l=3006
In particular, Angle doesn’t like our RGB8 and A8.