glfw: glfwPostEmptyEvent sometimes fails to unblock glfwWaitEvents on Ubuntu 18.04
From time to time, the empty event posted by a call to glfwPostEmptyEvent fails to unblock a call to glfwWaitEvents in the main thread.
A minimal program reproducing the issue is available here
It can be built and run on Linux like so:
g++ main.cpp -std=c++14 -lglfw -lX11 -ldl -lpthread && ./a.out
If the calls to glfwPostEmptyEvent and glfwWaitEvents don't overlap in time, the bug never happens. If the two calls run concurrently, it happens intermittently.
Note that on OSX, this bug never happens.
GLFW Version (from glfwGetVersionString()):
3.2.1 X11 GLX EGL clock_gettime /dev/js Xf86vm shared
Ubuntu version
18.04 LTS
- Note that I have tried posting several empty events at the same time: it doesn't help. It makes the bug a little harder to reproduce but doesn't fix it entirely.
Edit: added the build command. Edit 2: fixed the build command.
About this issue
- State: closed
- Created 6 years ago
- Comments: 63 (30 by maintainers)
Commits related to this issue
- Update glfw from upstream Fixes https://github.com/glfw/glfw/issues/1281 — committed to kovidgoyal/kitty by kovidgoyal 6 years ago
- Fix for https://github.com/glfw/glfw/issues/1281 — committed to OlivierSohn/glfw by kovidgoyal 6 years ago
- X11: Fix posted empty events sometimes being lost Fixes #1281. — committed to glfw/glfw by elmindreda 5 years ago
- Fix for https://github.com/glfw/glfw/issues/1281 Refactor common code into backend_utils.h Factorize common logic between wayland and x11, add doc. Add doc, minor change Minor change Change initi... — committed to OlivierSohn/glfw by kovidgoyal 6 years ago
- Fix #1281: glfwPostEmptyEvent sometimes doesn't wake up glfwWaitEvents on X11. This is a rare race that triggers often when running GLFW programs on GitHub Actions with Xvfb. The underlying issue is... — committed to joaodasilva/glfw by joaodasilva 2 years ago
- X11: Fix empty event race condition with a pipe Fixes #1281 Closes #2033 Related to #379 Related to #1285 — committed to elmindreda/glfw by elmindreda 2 years ago
- X11: Fix empty event race condition with a pipe There is a seemingly unavoidable race condition when waiting for data on the X11 display connection, as long as any other thread is also making Xlib ca... — committed to glfw/glfw by elmindreda 2 years ago
- X11: Use lower-latency poll where available This uses ppoll for waiting on file descriptors with a timeout, where that function has been available a while. On NetBSD, which will be getting ppoll in ... — committed to glfw/glfw by elmindreda 2 years ago
- Share X11 fd polling logic with Wayland This moves the X11 polling implementation to a separate file where it can be used by either the X11 or Wayland backend or both. This code should be POSIX comp... — committed to glfw/glfw by elmindreda 2 years ago
- X11: Fix empty event race condition with a pipe There is a seemingly unavoidable race condition when waiting for data on the X11 display connection, as long as any other thread is also making Xlib ca... — committed to swarnimarun/glfw-meson by elmindreda 2 years ago
- X11: Use lower-latency poll where available This uses ppoll for waiting on file descriptors with a timeout, where that function has been available a while. On NetBSD, which will be getting ppoll in ... — committed to swarnimarun/glfw-meson by elmindreda 2 years ago
Done. And just for completeness, here is my analysis of the root cause:
Basically, depending on the exact timing of the calls, Xlib on the GUI thread can read data off the fd and store it in its internal cache. The select() then blocks, even though the other thread has sent an event that the main thread has not yet seen.
This is fundamentally because of the bad (or, perhaps more correctly, ancient) design of Xlib. It really is not made for multiplexed I/O.
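The mechanism described above can be imitated without Xlib. The following is a minimal sketch, assuming a POSIX system: a plain pipe stands in for the X11 display connection, and `BufferedConnection`, `fdReadable` and `demoCachedRead` are hypothetical names invented for this illustration, not Xlib or GLFW APIs. It shows that once a library has moved pending data from the fd into its own user-space cache, polling the fd reports nothing to read, so a select() or poll() on that fd would block even though an event is waiting.

```cpp
#include <poll.h>
#include <unistd.h>
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a library that, like Xlib, keeps its own
// user-space cache of data it has already read from the connection fd.
struct BufferedConnection {
    int fd;
    std::vector<char> cache;

    // Move currently readable data from the fd into the cache.
    // Only call this when data is available; the read blocks otherwise.
    void fillCache() {
        char buf[64];
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) cache.insert(cache.end(), buf, buf + n);
    }
};

// True if the kernel reports the fd as readable right now (zero timeout).
bool fdReadable(int fd) {
    pollfd p{fd, POLLIN, 0};
    return poll(&p, 1, 0) > 0 && (p.revents & POLLIN) != 0;
}

struct DemoResult {
    bool readableBeforeDrain;  // fd state right after the event is posted
    bool readableAfterDrain;   // fd state after the library cached the event
    bool cachedAfterDrain;     // whether the event sits in the cache
};

DemoResult demoCachedRead() {
    int fds[2];
    if (pipe(fds) != 0) throw std::runtime_error("pipe() failed");
    BufferedConnection conn{fds[0], {}};

    // The "other thread" posts one event on the connection.
    const char event = 'e';
    ssize_t written = write(fds[1], &event, 1);
    assert(written == 1);
    (void)written;

    DemoResult r{};
    r.readableBeforeDrain = fdReadable(conn.fd);
    conn.fillCache();  // the library reads the event behind the waiter's back
    r.readableAfterDrain = fdReadable(conn.fd);
    r.cachedAfterDrain = !conn.cache.empty();

    close(fds[0]);
    close(fds[1]);
    return r;
}
```

Calling `demoCachedRead()` from a `main` function yields a result where the fd was readable before the drain and not after it, while the event is still present in the cache. A waiter that only watches the fd at that point misses the event until more data arrives.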
It's non-blocking, so the writes will simply fail. The write failure is ignored, so it is harmless.
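Assuming the comment above refers to a wake-up pipe of the kind named in the related commits ("Fix empty event race condition with a pipe"), the idea can be sketched as follows. This is an illustrative self-pipe, not GLFW's actual code: `WakeupPipe`, `post`, `wait` and `postUntilFull` are hypothetical names, and the sketch assumes a POSIX system. Both ends are non-blocking, so a write to a full pipe fails immediately instead of stalling the posting thread, and the failure can be ignored because the bytes already in the pipe guarantee that the waiter wakes up.

```cpp
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <stdexcept>

// Hypothetical wake-up channel built on a non-blocking pipe.
class WakeupPipe {
public:
    WakeupPipe() {
        if (pipe(fds_) != 0) throw std::runtime_error("pipe() failed");
        for (int fd : fds_) {
            int flags = fcntl(fd, F_GETFL);
            if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
                throw std::runtime_error("fcntl() failed");
        }
    }
    ~WakeupPipe() {
        close(fds_[0]);
        close(fds_[1]);
    }
    WakeupPipe(const WakeupPipe&) = delete;
    WakeupPipe& operator=(const WakeupPipe&) = delete;

    // Post a wake-up. Returns false when the write fails, for example with
    // EAGAIN on a full pipe. Callers may ignore that: a full pipe already
    // holds enough bytes to wake the waiter.
    bool post() {
        const char byte = 0;
        for (;;) {
            ssize_t n = write(fds_[1], &byte, 1);
            if (n == 1) return true;
            if (n == -1 && errno == EINTR) continue;  // retry if interrupted
            return false;
        }
    }

    // Wait up to timeoutMs for a wake-up, then discard all pending bytes.
    bool wait(int timeoutMs) {
        pollfd p{fds_[0], POLLIN, 0};
        if (poll(&p, 1, timeoutMs) <= 0 || (p.revents & POLLIN) == 0)
            return false;
        char buf[256];
        while (read(fds_[0], buf, sizeof buf) > 0) {
            // keep reading until the pipe is empty
        }
        return true;
    }

private:
    int fds_[2];
};

// Post until a write is rejected or `limit` posts were accepted.
// Returns the number of accepted posts.
int postUntilFull(WakeupPipe& p, int limit) {
    int accepted = 0;
    while (accepted < limit && p.post()) ++accepted;
    return accepted;
}
```

In a real program the waiter would poll this read end together with the display connection fd, so that either source ends the wait. Any number of posts collapses into a single wake-up because the waiter drains the whole pipe at once.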
The window is needed because of line 1115 in window.c. Basically, glfwPostEmptyEvent is a no-op when no windows are present. I don't know why that is the case; it is probably needed by some other backend.