conmon: Conmon hang after wake from sleep
On my two systems, one Silverblue 35 and one Silverblue 36, I sporadically get a conmon hang, using one core fully, right after waking from sleep. Usually happens in the morning.
$ conmon --version
conmon version 2.1.0
commit:
Unsure what info to provide; if debugging is needed I do not mind trying to get a stack trace with debug symbols.
About this issue
- State: closed
- Created 2 years ago
- Comments: 21 (8 by maintainers)
Commits related to this issue
- Stop using g_unix_signal_add() to avoid threads g_unix_signal_add() is implemented via an implicitly-created worker thread in GLib. This thread creates an eventfd, and conflicts with conmon's genera... — committed to allisonkarlitskaya/conmon by allisonkarlitskaya 2 years ago
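The commit message above is truncated and does not say what replaced g_unix_signal_add(). As a rough illustration only, one thread-free way to receive signals on Linux is signalfd(2), which turns pending signals into reads on a file descriptor that a poll loop can watch. The helper names open_signal_fd and read_one_signal below are made up for this sketch and are not taken from conmon or from the commit.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Hypothetical helper: block the given signals and return a signalfd
 * that reports them. Returns -1 on failure. No thread is created. */
int open_signal_fd(const int *signals, size_t count)
{
    sigset_t mask;
    sigemptyset(&mask);
    for (size_t i = 0; i < count; i++)
        sigaddset(&mask, signals[i]);

    /* Block normal delivery so the signals stay pending until they
     * are read from the descriptor. */
    if (sigprocmask(SIG_BLOCK, &mask, NULL) < 0)
        return -1;

    return signalfd(-1, &mask, SFD_CLOEXEC);
}

/* Hypothetical helper: read one pending signal from the signalfd.
 * Blocks if none is pending. Returns the signal number, or -1 on a
 * failed or short read. */
int read_one_signal(int sfd)
{
    struct signalfd_siginfo info;
    ssize_t n = read(sfd, &info, sizeof(info));
    if (n != (ssize_t)sizeof(info))
        return -1;
    return (int)info.ssi_signo;
}
```

The descriptor returned here is created by the caller, so the caller knows its number and can leave it out of any blanket fd cleanup. An eventfd created implicitly by a library thread has no such visibility.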
I managed to get a strace and a gdb backtrace of a 100% CPU pinned conmon. The issue is pretty straightforward.
strace:
gdb t a a bt:
So it's the GLib worker thread that's spinning. Not many fds get registered in that main context, but every main context has an eventfd used for cross-thread wakeups. Someone closed that.
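To see why a closed eventfd would make a loop spin rather than sleep, here is a small standalone sketch, assuming the loop is built on poll(). It does not reproduce GLib's internals. It only shows that poll() returns immediately with POLLNVAL for a descriptor that has been closed, even with a long timeout. The function name is made up for this example.

```c
#include <poll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if poll() flags a closed eventfd with
 * POLLNVAL, 0 if it does not, and -1 if setup or poll() itself fails. */
int poll_closed_eventfd_reports_nval(void)
{
    int efd = eventfd(0, EFD_CLOEXEC);
    if (efd < 0)
        return -1;

    /* Simulate some other code closing the fd behind the loop's back. */
    close(efd);

    struct pollfd pfd = { .fd = efd, .events = POLLIN, .revents = 0 };

    /* A 1000 ms timeout would normally let the caller sleep, but an
     * invalid fd makes poll() return right away. */
    int n = poll(&pfd, 1, 1000);
    if (n < 0)
        return -1;

    return (n == 1 && (pfd.revents & POLLNVAL)) ? 1 : 0;
}
```

A loop that keeps handing the same stale descriptor to poll() gets woken on every iteration and never blocks, which is consistent with one core being pinned at 100%.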
This looks like it was introduced in e2215a1c4c01c25f2fc1206ad4df012d10374b99, which is recent enough to be consistent with the bug being discovered in the last few months.
I took a very brief look at the (small) subset of GLib’s API that conmon is actually using, and if I had to make an educated guess, I’d say it’s the signal handlers stuff that is causing the worker thread to be spun up. That’s irrelevant though: the core issue here is that you can’t just randomly close all fds like that.
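As a hedged sketch of a more careful cleanup (not conmon's actual code), the function below walks /proc/self/fd and closes only descriptors at or above a given number that are not on an explicit keep list. The names close_fds_except and in_keep_list are invented for this example, and it assumes a Linux system with /proc mounted.

```c
#define _GNU_SOURCE
#include <dirent.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: true if fd appears in keep[]. */
static bool in_keep_list(int fd, const int *keep, size_t nkeep)
{
    for (size_t i = 0; i < nkeep; i++)
        if (keep[i] == fd)
            return true;
    return false;
}

/* Hypothetical helper: close every open fd >= lowfd except those in
 * keep[]. Returns the number of fds closed, or -1 if /proc/self/fd
 * cannot be opened. */
int close_fds_except(int lowfd, const int *keep, size_t nkeep)
{
    DIR *dir = opendir("/proc/self/fd");
    if (dir == NULL)
        return -1;

    int self = dirfd(dir);  /* the fd used for this directory scan */
    int closed = 0;
    struct dirent *ent;

    while ((ent = readdir(dir)) != NULL) {
        char *end;
        long fd = strtol(ent->d_name, &end, 10);

        /* Skip non-numeric entries such as "." and "..". */
        if (end == ent->d_name || *end != '\0')
            continue;
        if (fd < lowfd || fd == self || in_keep_list((int)fd, keep, nkeep))
            continue;
        if (close((int)fd) == 0)
            closed++;
    }

    closedir(dir);
    return closed;
}
```

A keep list only protects descriptors the caller knows about. It cannot name an eventfd that a library created on its own thread, so a cleanup like this would have to run before anything else in the process has had a chance to open descriptors.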
Out of interest, as this is still causing CPU/fan overheating (especially during this hot summer), is that fix going to go into Fedora soon? Or has it not actually been fixed yet? I still have to pkill -9 conmon every day.