webrender: PBO copies are slow on Angle

It appears that Windows/Angle spends tons of time in update_texture_from_pbo: https://perfht.ml/2zJnkg4

Angle defers mapping and filling the actual D3D11 staging buffer until this call. Since the copy itself is done in a draw call, it has to stall the GPU until the copy is done. That is one issue, but the one we are currently hitting is different: Map itself is waiting. I suppose it’s waiting for the GPU because the texture is still in use, which implies our PBO orphaning doesn’t work as expected (see Device::orphan_pbo).

cc @jrmuizel @glennw

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 17 (11 by maintainers)

Most upvoted comments

Yes, that’s my plan 😃 We’ll definitely need separate paths on Windows/Angle versus the world 😃

When I originally set up the GPU cache, the idea was to be able to use an unsynchronized map to update the texture, on platforms where that made sense (i.e. a persistent pointer into the texture data). It does two things to enable this:

  • Items newer than an arbitrary number of frames (currently set to 10) are never evicted.
  • When items are invalidated for update, the old location is orphaned - that is, a new location is selected for the updated data, and the old location is left to be reclaimed under the eviction policy mentioned above.

In theory, this means we don’t need any synchronization here - these two policies should guarantee that we never write to a location that is < 10 frames old, which should thus guarantee that the GPU is never reading incorrect data.
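A minimal sketch of those two policies, in case it helps the discussion. This is a toy model, not the actual GPU cache code: `CacheModel`, `Slot`, `alloc`, `touch` and `update` are hypothetical names, and it tracks slots only, not the texture data itself.

```rust
/// Toy model of the two GPU cache policies described above.
/// All names here are hypothetical, not WebRender's real API.
const FRAMES_BEFORE_EVICTION: u64 = 10;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Slot(pub usize);

pub struct CacheModel {
    /// Frame at which each slot was last written or requested.
    last_used_frame: Vec<u64>,
    frame: u64,
}

impl CacheModel {
    pub fn new() -> Self {
        CacheModel { last_used_frame: Vec::new(), frame: 0 }
    }

    pub fn begin_frame(&mut self) {
        self.frame += 1;
    }

    /// Policy 1: only reuse a slot that has gone unused for at least
    /// FRAMES_BEFORE_EVICTION frames; otherwise grow the cache.
    pub fn alloc(&mut self) -> Slot {
        let frame = self.frame;
        let reusable = self
            .last_used_frame
            .iter()
            .position(|&last| frame - last >= FRAMES_BEFORE_EVICTION);
        let index = match reusable {
            Some(i) => i,
            None => {
                self.last_used_frame.push(frame);
                self.last_used_frame.len() - 1
            }
        };
        self.last_used_frame[index] = frame;
        Slot(index)
    }

    /// Mark a slot as still in use this frame.
    pub fn touch(&mut self, slot: Slot) {
        self.last_used_frame[slot.0] = self.frame;
    }

    /// Policy 2: an update never overwrites in place. The old slot is
    /// simply no longer touched (orphaned) and ages out via policy 1.
    pub fn update(&mut self, _old: Slot) -> Slot {
        self.alloc()
    }
}
```

With this model, a slot orphaned on frame N cannot be handed out again before frame N + 10, which is the property the unsynchronized map would rely on.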

Admittedly, I’ve never actually tested this in practice - I intended to revisit it at a later time. But perhaps we can signal to ANGLE that it doesn’t need to do any blocking and see how it goes?

Another possibility (more of a temporary quick fix / hack) could be to round-robin a series of backing textures for the GPU cache, and update / upload the entire texture each frame. This sounds bad, but that data is actually quite small, and may be perfectly fine as an interim solution, if most of the time is spent blocking on a GPU fence or similar.
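Roughly what I have in mind for the round-robin idea, as a sketch only. `BackingTexture` is a hypothetical stand-in that holds bytes on the CPU; real code would hold a GL texture and issue an upload instead of `copy_from_slice`. The number of textures is also an assumption to be tuned.

```rust
/// Hypothetical stand-in for a GPU texture backing the cache.
pub struct BackingTexture {
    pub data: Vec<u8>,
}

pub struct RoundRobinCache {
    textures: Vec<BackingTexture>,
    current: usize,
}

impl RoundRobinCache {
    pub fn new(count: usize, size_in_bytes: usize) -> Self {
        assert!(count > 0, "need at least one backing texture");
        let textures = (0..count)
            .map(|_| BackingTexture { data: vec![0; size_in_bytes] })
            .collect();
        RoundRobinCache { textures, current: 0 }
    }

    /// Advance to the next texture and upload the whole CPU-side copy
    /// into it. Returns the index to bind for this frame's draws.
    pub fn upload_frame(&mut self, cpu_copy: &[u8]) -> usize {
        self.current = (self.current + 1) % self.textures.len();
        let target = &mut self.textures[self.current];
        assert_eq!(cpu_copy.len(), target.data.len());
        target.data.copy_from_slice(cpu_copy);
        self.current
    }

    pub fn texture(&self, index: usize) -> &BackingTexture {
        &self.textures[index]
    }
}
```

The point of cycling is that the texture written this frame is not the one the GPU was reading from in the previous frame, so the upload should not have to wait on it.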

@kvark @jrmuizel These might be worth pursuing if we’re not able to get the current path running well on ANGLE.

The profile shows us hitting the fast path, not missing it.

On Nov 27, 2017 8:29 PM, “Glenn Watson” notifications@github.com wrote:

Awesome, thanks for following this up!


Thanks to the Angle team, I got some answers! This is not about orphaning, but rather about some texture formats not being supported by the fast path of buffer->texture copies: https://cs.chromium.org/chromium/src/third_party/angle/src/libANGLE/renderer/d3d/d3d11/Renderer11.cpp?type=cs&q=supportsFastCopyBufferToTexture&sq=package:chromium&l=3006

In particular, Angle doesn’t like our RGB8 and A8.
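One way the separate Windows/Angle path could be selected, as a sketch. The `Format` and `UploadPath` enums and `choose_upload_path` are hypothetical names, not existing WebRender or ANGLE APIs. The only fact encoded is the one above (RGB8 and A8 miss the fast path on Angle); whether a direct upload is actually faster for those formats is an assumption that would need profiling.

```rust
/// Hypothetical subset of texture formats, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Rgb8,
    A8,
    /// Any format not called out in this thread.
    Other,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UploadPath {
    /// Stage the data in a PBO, then copy buffer -> texture.
    Pbo,
    /// Upload straight from client memory, skipping the PBO.
    Direct,
}

/// Pick an upload path per format. On Angle, avoid PBO copies for the
/// formats reported to miss the fast buffer -> texture path.
pub fn choose_upload_path(is_angle: bool, format: Format) -> UploadPath {
    match (is_angle, format) {
        (true, Format::Rgb8) | (true, Format::A8) => UploadPath::Direct,
        _ => UploadPath::Pbo,
    }
}
```

Everything outside Angle keeps the current PBO path unchanged.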