kue: Jobs stuck in inactive state
Jobs get stuck in the inactive state fairly often for us. We noticed that the length of q:[type]:jobs is zero, even when there are inactive jobs of that type, so when getJob calls blpop, there is nothing to process.
It looks like this gets set when a job is saved and the state is set to inactive using lpush q:[type]:jobs 1. We’re wondering if this is failing in some cases and once the count is off, jobs remain unprocessed.
Has anyone else seen this issue?
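The mismatch described above can be checked for directly. The following is a minimal sketch, not part of kue: `findCounterMismatches` is a hypothetical helper, and it assumes a promise-based Redis client exposing `llen` and `zcard` (as ioredis does), the default `q` key prefix, and that inactive job ids are kept in a sorted set at q:jobs:[type]:inactive (the key named later in this thread).

```javascript
// Hypothetical diagnostic helper; kue does not ship this function.
// `client` is assumed to be a promise-based Redis client with
// llen(key) and zcard(key) methods (ioredis exposes both).
async function findCounterMismatches(client, types, prefix = 'q') {
  const mismatches = [];
  for (const type of types) {
    // List that workers block on with BLPOP: one entry is pushed per inactive job.
    const signals = await client.llen(`${prefix}:${type}:jobs`);
    // Assumed sorted set holding the ids of jobs in the inactive state.
    const inactive = await client.zcard(`${prefix}:jobs:${type}:inactive`);
    // Fewer list entries than inactive jobs means some jobs will never be popped.
    if (signals < inactive) {
      mismatches.push({ type, signals, inactive, missing: inactive - signals });
    }
  }
  return mismatches;
}
```

Any type returned here has more inactive jobs than entries in its q:[type]:jobs list, which is the condition under which blpop has nothing to return.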
About this issue
- State: closed
- Created 12 years ago
- Comments: 159 (10 by maintainers)
Commits related to this issue
- Stuck inactive jobs watchdog, Closes #130 — committed to Automattic/kue by behrad 10 years ago
- Add a redis lua watchdog to fix stuck inactive jobs, fixes #130 — committed to vlad-x/kue by behrad 10 years ago
- Stuck inactive jobs watchdog, Closes #130 — committed to vlad-x/kue by behrad 10 years ago
@theoutlander try bull which has a similar API to Kue, and if you need help ask in the gitter channel: https://gitter.im/OptimalBits/bull The reason I wrote bull in the first place was due to the stuck jobs issue.
If you really care about this issue, the latest bull is on par with kue feature-wise, but with a non-polling, mostly atomic design. Why wait for kue 1.0 when you can use bull? 😃 https://www.npmjs.com/package/bull
Disclaimer: I am the author of the package. I started it out of frustration with some of kue's long-standing issues, which are still not completely fixed today, and I can tell from experience that it is not completely trivial to rewrite everything using Lua scripts and blocking Redis calls…
Who watches the Watchmen? @Caspain
When the queues were not used for a while (sometimes for hours), the blpop on q:jobs:JOB_TYPE:inactive seemed to stop working. I added a keep-alive that executed every 5 minutes and jobs are no longer stuck!
@manast Thanks for creating this. I’m loving it so far. I ran into a weird issue today (https://github.com/OptimalBits/bull/issues/170), not sure why. It went away after a while / restarting the IDE (WebStorm).
I haven’t faced any issues with stuck jobs so far! Good work! And great job keeping a similar API…the transition was seamless!
On process kill or term I gracefully shift jobs back to inactive, and that normally handles any stuck jobs. My issue was just a faulty worker.
@sbrocher You should exit gracefully, see https://github.com/LearnBoost/kue#graceful-shutdown. If it still happens after that, it is probably related to your connection, or to redis/node.js process crashes, which cause inconsistencies between Kue's sets in redis. I can provide a patch which continuously tries to fix that inconsistency if it happens. Then no stuck inactive job will appear. But first, can you all please let me know if any of you see this issue on a local (or LAN) redis instance?
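For reference, the graceful shutdown mentioned above can be wired up roughly as follows. This is a sketch under assumptions: `installGracefulShutdown` is a hypothetical wrapper, and it assumes the `queue.shutdown(timeout, callback)` form shown in recent kue READMEs; the argument order may differ in older versions, so check the README for the version you run.

```javascript
// Hypothetical wrapper; only queue.shutdown() comes from kue itself.
// `proc` defaults to the real process and is a parameter so the wiring
// can be exercised without sending real signals.
function installGracefulShutdown(queue, proc = process, timeoutMs = 5000) {
  proc.once('SIGTERM', () => {
    // Assumed signature: shutdown(timeoutMs, callback). Gives running jobs
    // up to timeoutMs to finish before the queue stops.
    queue.shutdown(timeoutMs, (err) => {
      if (err) console.error('Kue shutdown error:', err);
      proc.exit(0);
    });
  });
}

// Typical use (assumes kue is installed):
//   const queue = require('kue').createQueue();
//   installGracefulShutdown(queue);
```

This only covers a clean SIGTERM. A crash or a SIGKILL skips the handler entirely, which is the case the comment above attributes the remaining inconsistencies to.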
@mikemoser not sure if “tacked” was the right word… What I was trying to say is that, for some reason, the number of stuck tasks seems to be related to the number of simultaneous tasks specified for the job.
For example if I do:
4 tasks will be stuck
6 tasks will be stuck
and so on…