finit: condition not reasserted after crash.
If a service crashes and is restarted, its ready condition is not set again, which leaves other daemons that depend on it paused.
# Start Service
<service/running> is set
<service/ready> is set
# Crash the service in the background using kill command or similar
<service/running> is cleared
<service/ready> is cleared
service_retry() will eventually print "Successfully restarted crashing service"
svc_set_state() clears all the conditions but then only sets "running", which results in:
<service/running> is set
<service/ready> is cleared
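To make the sequence above concrete, here is a tiny stand-alone model of it. This is only a sketch of the behaviour as I describe it, not finit's actual code: `struct model_svc`, `model_set_state()` and `model_ready_after_restart()` are hypothetical names made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a service record; NOT finit's real svc_t. */
struct model_svc {
    bool running;   /* models <service/running> */
    bool ready;     /* models <service/ready>   */
};

enum model_state { MODEL_HALTED, MODEL_RUNNING };

/*
 * Models the reported behaviour: every condition is cleared first,
 * then only "running" is reasserted when entering the running state.
 */
static void model_set_state(struct model_svc *svc, enum model_state state)
{
    svc->running = false;
    svc->ready   = false;

    if (state == MODEL_RUNNING)
        svc->running = true;
}

/* Replays start -> crash -> restart and returns the final ready flag. */
static bool model_ready_after_restart(void)
{
    struct model_svc svc = { .running = true, .ready = true };

    model_set_state(&svc, MODEL_HALTED);    /* service crashes     */
    model_set_state(&svc, MODEL_RUNNING);   /* service is restarted */

    return svc.ready;
}
```

In this model `model_ready_after_restart()` returns false: nothing ever sets ready again after the restart, so anything waiting on it stays blocked.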
This also occurs for systemd ready flags (the use case that I care about):
<service/running> is set
<service/ready> is set
# Service crashes.
<service/running> is cleared
<service/ready> is cleared
# Daemon asserts READY=1
<service/ready> gets set
# Service restart timeout triggers
svc_set_state() clears all the conditions and only reapplies <service/running>
The fix isn’t obvious to me on this one. I feel like the service is going to need to keep track of its ready state somewhere, or perhaps if svc->notify then don’t clear the ready on a svc_set_state(RUNNING).
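Roughly what I mean by the second idea, as a stand-alone sketch. Everything here is hypothetical (`struct sketch_svc`, its `notify` flag and `sketch_enter_running()` are illustration names, not finit's API), and it assumes the daemon's READY=1 arrived before the transition to running:

```c
#include <stdbool.h>

/* Hypothetical service record; the fields are assumptions for illustration. */
struct sketch_svc {
    bool notify;    /* service signals readiness itself, e.g. READY=1 */
    bool running;   /* models <service/running> */
    bool ready;     /* models <service/ready>   */
};

/*
 * Entering the running state: clear the conditions as before, but for
 * notify-style services keep a ready flag the daemon already asserted.
 */
static void sketch_enter_running(struct sketch_svc *svc)
{
    bool keep_ready = svc->notify && svc->ready;

    svc->running = false;
    svc->ready   = false;

    svc->running = true;
    svc->ready   = keep_ready;
}
```

With this, a notify service that already reported ready keeps the condition across the transition, while other services still rely on whatever normally sets ready for them.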
Edit: Wild thought. You could add an additional SVC_STATE too: `Halted -> Waiting -> Preparing (only valid for notify services) -> Running`
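A rough sketch of how that extra state could sit in the transition logic. The enum values and `sketch_next()` are hypothetical and only model the idea; they are not finit's real state names or functions:

```c
#include <stdbool.h>

/* Hypothetical states; SK_PREPARING is the proposed addition. */
enum sketch_state { SK_HALTED, SK_WAITING, SK_PREPARING, SK_RUNNING };

/*
 * Returns the next state. Notify-style services pass through
 * SK_PREPARING and stay there until readiness has been signalled;
 * all other services go straight from SK_WAITING to SK_RUNNING.
 */
static enum sketch_state sketch_next(enum sketch_state cur, bool notify, bool got_ready)
{
    switch (cur) {
    case SK_HALTED:
        return SK_WAITING;
    case SK_WAITING:
        return notify ? SK_PREPARING : SK_RUNNING;
    case SK_PREPARING:
        return got_ready ? SK_RUNNING : SK_PREPARING;
    case SK_RUNNING:
        return SK_RUNNING;
    }

    return cur;
}
```

The point of the extra state is that "running" would then imply "ready" for notify services, so a restart could not end up with running set and ready cleared.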
About this issue
- State: closed
- Created a year ago
- Comments: 20 (20 by maintainers)
Commits related to this issue
- test: verify readiness notification on crash/restart Issue #343 reports that readiness notification on crash/restart is lost. This update to notify.sh reproduces that bug. Signed-off-by: Joachim Wib... — committed to troglobit/finit by troglobit a year ago
- Fix #343: READY state lost when service is restarted after crash The svc_set_state() function not only handles state transitions, it also makes sure to cancel any outstaing service timeouts. Before ... — committed to troglobit/finit by troglobit a year ago
- test: extend notify.sh with checks for 'restart serv' and 'reload' Issue #343. Signed-off-by: Joachim Wiberg <troglobit@gmail.com> — committed to troglobit/finit by troglobit a year ago
- test: s6/systemd style services do not necessarily create a PID file Ensure the test mimics actual real-world scenario where daemons do not create a PID file at all. Issue #343. Signed-off-by: Joac... — committed to troglobit/finit by troglobit a year ago
- Fix #343: handle 'initctl reload' of unmodified non-native services When 'initctl reload' is called new "configuration generation" is started by Finit. This mechanism ensures services reaffirm their... — committed to troglobit/finit by troglobit a year ago
- Issue #343: READY state lost when service is restarted after crash The svc_set_state() function not only handles state transitions, it also makes sure to cancel any outstaing service timeouts. Befor... — committed to troglobit/finit by troglobit a year ago
- test: reproduce pidfile plugin marking systemd services 'started' Issue #343 Signed-off-by: Joachim Wiberg <troglobit@gmail.com> — committed to troglobit/finit by troglobit a year ago
Messed around with MyLinux tonight to hopefully help reproduce this without having to deal with censoring logs. Running finit from master.
Minimal code to reproduce:
Some logs that show off the behaviour.
Now this is a little trickier to show over text. But if you spam `initctl cond` you can see ready get set, then cleared after the 2 or 5 second timeout.