openrc: rc: do not call free/malloc in signal handlers
I’ve spent part of this week chasing after a dead lock in “openrc shutdown” (running shutdown or reboot in a loop hangs once every few days in a malloc() call):
https://www.openwall.com/lists/musl/2023/01/23/1
It turns out the problem is almost certainly that we’re doing a free in our SIGCHLD handler:
static void handle_signal(int sig) {
...
/* Remove that pid from our list */
if (pid > 0)
remove_pid(pid);
...
static void remove_pid(pid_t pid) {
...
LIST_REMOVE(p, entries);
free(p);
free/malloc are not signal-safe, in particular musl libc does not take any lock as long as the process does not create threads – if there had been we would likely just have seen a deadlock instead – so if we have a stack like the following (bottom to top):
openrc functions...
malloc()
handle_signal()
free()
the free can be called while the malloc state is inconsistent and further corrupt mallocng’s internal state. (Yes that probably wouldn’t cause problems without rc_parallel, call me a red neck)
Anyway, there are a few more mallocs in the SIGUSR1’s rc_plugin_run but as a first approximation I’d like to make that remove_pid delay its free. The most straight-forward solution I can think of would be to have the signal handler not call remove_pid, but instead move the list link to a “to free later” list that can be checked once in a while during the main thread.
That still is quite a bit of overengineering, but I cannot think of anything else that would be safe short of just leaking the memory. Would that be acceptable?
I’ll open a PR early next week unless someone beats me to it.
About this issue
- Original URL
- State: open
- Created a year ago
- Reactions: 1
- Comments: 19 (10 by maintainers)
Commits related to this issue
- openrc: avoid unnecessary malloc inside sig-handler malloc (called by xasprintf) is not async-signal-safe. beside, the string here is constant, so there's no need to malloc it all. eerrorx isn't asy... — committed to N-R-K/openrc by N-R-K a year ago
- openrc: avoid unnecessary malloc inside sig-handler malloc (called by xasprintf) is not async-signal-safe. beside, the string here is constant, so there's no need to malloc it all. eerrorx isn't asy... — committed to N-R-K/openrc by N-R-K a year ago
- openrc: avoid unnecessary malloc inside sig-handler malloc (called by xasprintf) is not async-signal-safe. beside, the string here is constant, so there's no need to malloc it all. eerrorx isn't asy... — committed to N-R-K/openrc by N-R-K a year ago
- openrc: avoid unnecessary malloc inside sig-handler malloc (called by xasprintf) is not async-signal-safe. beside, the string here is constant, so there's no need to malloc it all. eerrorx isn't asy... — committed to OpenRC/openrc by N-R-K a year ago
- rc: avoid calling free inside SIGCHLD handler `free` is not async-signal-safe and calling it inside a signal handler can have bad effects, as reported in the musl ML: https://www.openwall.com/lists/m... — committed to N-R-K/openrc by N-R-K a year ago
- rc: block signals during pid list operations the pid list will be accessed inside the signal handler. so we must ensure that a list operation doesn't get interrupted by a signal leaving the list at a... — committed to N-R-K/openrc by N-R-K a year ago
- rc: avoid calling free inside SIGCHLD handler `free` is not async-signal-safe and calling it inside a signal handler can have bad effects, as reported in the musl ML: https://www.openwall.com/lists/m... — committed to N-R-K/openrc by N-R-K a year ago
- rc: block SIGCHLD during pid list operations the pid list will be accessed inside the SIGCHLD signal handler. so we must ensure SIGCHLD handler doesn't get invoked while the list is at an inconsisten... — committed to N-R-K/openrc by N-R-K a year ago
- rc: block SIGCHLD during pid list operations the pid list will be accessed inside the SIGCHLD signal handler. so we must ensure SIGCHLD handler doesn't get invoked while the list is at an inconsisten... — committed to N-R-K/openrc by N-R-K a year ago
- openrc-run: avoid malloc inside sig-handler same rational as 459783bb Bug: https://github.com/OpenRC/openrc/issues/589 — committed to N-R-K/openrc by N-R-K a year ago
- start-stop-daemon: avoid malloc inside sig-handler same rational as 459783bb Bug: https://github.com/OpenRC/openrc/issues/589 — committed to N-R-K/openrc by N-R-K a year ago
- openrc-run: avoid malloc inside sig-handler same rational as 459783bb Bug: https://github.com/OpenRC/openrc/issues/589 — committed to OpenRC/openrc by N-R-K a year ago
- start-stop-daemon: avoid malloc inside sig-handler same rational as 459783bb Bug: https://github.com/OpenRC/openrc/issues/589 — committed to OpenRC/openrc by N-R-K a year ago
- rc: avoid calling free inside SIGCHLD handler `free` is not async-signal-safe and calling it inside a signal handler can have bad effects, as reported in the musl ML: https://www.openwall.com/lists/m... — committed to OpenRC/openrc by N-R-K a year ago
- rc: block SIGCHLD during pid list operations the pid list will be accessed inside the SIGCHLD signal handler. so we must ensure SIGCHLD handler doesn't get invoked while the list is at an inconsisten... — committed to OpenRC/openrc by N-R-K a year ago
OpenRC needs to respect async-signal-safe settings regardless of difficulty
I agree, merely toggling a flag is certainly the best handler imagineable. Checking after i.e. waitpid() and postponing maintenance till then should usually do the trick.
Just added in my 2 cents, since I opened https://bugs.gentoo.org/434532 a ‘little while’ back.