kue: Jobs Getting Stuck in Active State
Hi @behrad, I am still experiencing jobs being stuck in the active state, similar to issue #391. I am using kue@0.8.11 and redis 2.8.17.
I have tried to carefully follow all the recommended steps to avoid this. Here are some code snippets. If I’m doing something wrong I would love to hear about it. I’ve just torn out the bits of source that deal with kue/jobs here and redacted stuff that wasn’t relevant.
producer – separate node process
var jobs = kue.createQueue({ disableSearch: true, redis: config.redis });
var dying = false;
process.on('uncaughtException', function (err) {
log.error('uncaught exception', err.stack);
die();
dying = true;
});
process.on('SIGTERM', function () {
log.error('SIGTERM');
die();
dying = true;
});
function die() {
if (!dying) {
jobs.shutdown(function (err) {
if (err) { log.error('Kue DID NOT shutdown gracefully', err); }
else { log.info('Kue DID shutdown gracefully'); }
process.exit(1);
});
}
}
// createJob( ) is called on an interval as needed
function createJob(data) {
var job = jobs.create('deliver:sub', { title: 'Deliver Subscription', sub: data })
.removeOnComplete(true)
.attempts(5)
.backoff({type: 'exponential'})
.save(function (err) {
if (err) { return log.error('Error creating deliver:sub', err); }
log.info(util.format('job[%s] created', job.id));
});
job.on('complete', function() {
/* logic here */
});
job.on('failed', function () {
log.error(util.format('job[%s] failed ', job.id));
});
}
consumer – separate node process
var jobs = kue.createQueue({ disableSearch: true, redis: config.redis });
try {
kue.app.listen(3001);
} catch (err) {
log.error('Error: could not start kue express server', err);
}
// called in master b/c we use job retries
jobs.promote();
jobs.watchStuckJobs();
var dying = false;
process.on('uncaughtException', function (err) {
log.error('uncaught exception', err.stack);
die();
dying = true;
});
process.on('SIGTERM', function () {
log.error('SIGTERM');
die();
dying = true;
});
function die() {
if (!dying) {
jobs.shutdown(function (err) {
if (err) { log.error('Kue DID NOT shutdown gracefully ', err); }
else { log.info('Kue DID shutdown gracefully'); }
process.exit(1);
});
}
}
jobs.process('deliver:sub', 5, function (job, done) {
var domain = require('domain').create();
domain.on('error', function (err) {
log.error('domain error for deliver:sub', err);
done(err);
});
domain.run(function () {
process.nextTick(function () {
/* logic here that ends with either */
done();
/* OR */
done(err);
});
});
});
About this issue
- State: open
- Created 9 years ago
- Comments: 38
In 99% of cases, a job stuck in the ACTIVE state means the user application failed to call done. So first you should try to trace what happened to your stuck job, the node.js process, and the worker 😃
I had the same problem for a while: when the server is stopped or crashes while a job is being processed, the job stays in the active state forever after the server restarts, and subsequent jobs are never processed.
I tried, upon server initialization, to put all crashed active jobs back in the inactive state as described in Programmatic Job Management, but it didn’t work very well for me. It unblocked the queue, but the crashed jobs that were put in the inactive state were never processed, even when I changed their “attempts” option to 10.
Here is how I fixed it:
Upon initialization, I check for any active jobs using kue.active() or kue.Job.rangeByState(). For every active job found, I create a new job with the same data and I call job.complete() on the old one. Note that you need to do this upon server initialization and before queue.process() is called.
Here is the code I use to do so:
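The snippet referred to above is missing from this copy of the thread. What follows is a minimal sketch of the described approach, not the commenter's code. It assumes kue's documented `queue.active(fn)` (which yields job ids) and `kue.Job.get(id, fn)`, plus the `job.complete()` call mentioned in the comment; `requeueActiveJobs` is a hypothetical helper name. It takes a completion callback, so `queue.process()` can be deferred until every active job has been handled.

```javascript
// Sketch only: re-create each job left in the active state, then mark the old one complete.
// `queue` is assumed to be the object returned by kue.createQueue(); `Job` is assumed to be kue.Job.
// `requeueActiveJobs` is a hypothetical name, not part of kue.
function requeueActiveJobs(queue, Job, callback) {
  queue.active(function (err, ids) {
    if (err) { return callback(err); }
    var pending = ids.length;
    var requeued = [];
    var failed = false;
    if (pending === 0) { return callback(null, requeued); }

    ids.forEach(function (id) {
      Job.get(id, function (err, oldJob) {
        if (failed) { return; }
        if (err) { failed = true; return callback(err); }

        // New job with the same type and payload as the stuck one.
        queue.create(oldJob.type, oldJob.data).save(function (err) {
          if (failed) { return; }
          if (err) { failed = true; return callback(err); }

          // Assumption: job.complete() moves the old job out of the active state.
          oldJob.complete();
          requeued.push(id);
          if (--pending === 0) { callback(null, requeued); }
        });
      });
    });
  });
}

// Possible usage (needs a real kue queue, so it is left commented out):
// requeueActiveJobs(jobs, kue.Job, function (err) {
//   if (err) { return log.error('requeue failed', err); }
//   jobs.process('deliver:sub', 5, worker);
// });
```

Settings such as attempts or backoff are not copied to the new job in this sketch; they would need to be set again on the job returned by `queue.create()`.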
Note that this code is not async, so if you call queue.process() soon after, you’ll need to make it async in order to be sure that all active jobs are handled before running the queue.
I think what is causing problems is the call to KEYS in watchStuckJobs. Here’s the problematic part of the Lua script:
The KEYS command shouldn’t be run in a production environment. For more information, check the warning in the command’s documentation.
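For comparison, Redis 2.8.0 and later provide SCAN, which walks the keyspace in small batches instead of blocking the server the way KEYS can. The sketch below is a client-side illustration of that pattern, not a patch to kue's Lua script. It assumes a callback-style node_redis client whose `scan` reply is `[nextCursor, keys]`; `scanKeys` is a hypothetical helper, and the `q:job:*` pattern in the usage comment assumes kue's default `q` prefix.

```javascript
// Sketch only: collect keys matching a pattern with SCAN rather than KEYS.
// `client` is assumed to be a callback-style node_redis client.
// `scanKeys` is a hypothetical helper name.
function scanKeys(client, pattern, callback) {
  var found = [];

  function step(cursor) {
    client.scan(cursor, 'MATCH', pattern, 'COUNT', 100, function (err, reply) {
      if (err) { return callback(err); }
      var next = String(reply[0]);
      found = found.concat(reply[1]);
      // A returned cursor of "0" means the iteration is finished.
      if (next === '0') { return callback(null, found); }
      step(next);
    });
  }

  step('0');
}

// Possible usage (needs a real Redis connection, so it is left commented out):
// scanKeys(redisClient, 'q:job:*', function (err, keys) {
//   if (err) { return log.error('scan failed', err); }
//   log.info(util.format('%d job keys found', keys.length));
// });
```

SCAN may return the same key more than once during a full iteration, so callers that need unique keys should deduplicate the result.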