quartz: Invalid misfired trigger breaks misfired trigger process
We recently discovered a scenario where we had 20 perpetually misfired triggers in our Quartz scheduler. The jobs associated with these triggers were never being run, and it took a while to figure out why. It turns out there was an “invalid” misfired trigger that was “breaking” the misfire process. Specifically, a RECOVERING_JOBS trigger had a record in QRTZ_TRIGGERS with a TRIGGER_TYPE of ‘SIMPLE’, but there was not a record in QRTZ_SIMPLE_TRIGGERS for it. As a result, each time the MisfireHandler ran, it would error out with an exception on the first trigger and skip the remaining 19 triggers.
08/10/16 12:41:06 [QuartzScheduler_ClusterScheduler-1470857567998_MisfireHandler] INFO - Handling 20 trigger(s) that missed their scheduled fire-time.
08/10/16 12:41:06 [QuartzScheduler_ClusterScheduler-1470857567998_MisfireHandler] ERROR - MisfireHandler: Error handling misfires: Couldn't retrieve trigger: No record found for selection of Trigger with key: 'RECOVERING_JOBS.recover_X.X.X.X1415014951113_1415015028210' and statement: SELECT * FROM QRTZ_SIMPLE_TRIGGERS WHERE SCHED_NAME = 'ClusterScheduler' AND TRIGGER_NAME = ? AND TRIGGER_GROUP = ?
org.quartz.JobPersistenceException: Couldn't retrieve trigger: No record found for selection of Trigger with key: 'RECOVERING_JOBS.recover_X.X.X.X1415014951113_1415015028210' and statement: SELECT * FROM QRTZ_SIMPLE_TRIGGERS WHERE SCHED_NAME = 'ClusterScheduler' AND TRIGGER_NAME = ? AND TRIGGER_GROUP = ?
at org.quartz.impl.jdbcjobstore.JobStoreSupport.retrieveTrigger(JobStoreSupport.java:1533)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.recoverMisfiredJobs(JobStoreSupport.java:979)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.doRecoverMisfires(JobStoreSupport.java:3199)
at org.quartz.impl.jdbcjobstore.JobStoreSupport$MisfireHandler.manage(JobStoreSupport.java:3947)
at org.quartz.impl.jdbcjobstore.JobStoreSupport$MisfireHandler.run(JobStoreSupport.java:3968)
Caused by: java.lang.IllegalStateException: No record found for selection of Trigger with key: 'RECOVERING_JOBS.recover_X.X.X.X1415014951113_1415015028210' and statement: SELECT * FROM QRTZ_SIMPLE_TRIGGERS WHERE SCHED_NAME = 'ClusterScheduler' AND TRIGGER_NAME = ? AND TRIGGER_GROUP = ?
at org.quartz.impl.jdbcjobstore.SimpleTriggerPersistenceDelegate.loadExtendedTriggerProperties(SimpleTriggerPersistenceDelegate.java:95)
at org.quartz.impl.jdbcjobstore.StdJDBCDelegate.selectTrigger(StdJDBCDelegate.java:1819)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.retrieveTrigger(JobStoreSupport.java:1531)
... 4 common frames omitted
It would be nice if there were some additional error handling to guard against this scenario, so that when one trigger has an issue, it doesn’t prevent the other misfired triggers from being handled properly.
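To make the request concrete, here is a minimal sketch of the per-trigger isolation I have in mind. It is not Quartz code: the class `MisfireIsolationSketch`, the `recoverAll` method, and the `loader` function are hypothetical stand-ins for the misfire loop and for `retrieveTrigger`, and trigger keys are plain strings to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only; none of these names exist in Quartz.
public class MisfireIsolationSketch {

    /** Outcome of one pass: triggers that were handled and keys that failed to load. */
    public static final class Result {
        public final List<String> handled = new ArrayList<>();
        public final List<String> failed = new ArrayList<>();
    }

    /**
     * Tries to load every misfired trigger. A failure on one key is recorded
     * and the loop moves on, instead of aborting the whole batch.
     */
    public static Result recoverAll(List<String> misfiredKeys,
                                    Function<String, String> loader) {
        Result result = new Result();
        for (String key : misfiredKeys) {
            try {
                // Stand-in for retrieveTrigger; may throw for a broken trigger.
                String trigger = loader.apply(key);
                result.handled.add(trigger);
            } catch (RuntimeException e) {
                // Isolate the failure and continue with the next trigger.
                result.failed.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = List.of(
                "DEFAULT.jobA", "RECOVERING_JOBS.broken", "DEFAULT.jobB");
        Result r = recoverAll(keys, key -> {
            // Simulate the missing QRTZ_SIMPLE_TRIGGERS row for one trigger.
            if (key.startsWith("RECOVERING_JOBS.")) {
                throw new IllegalStateException("No record found for " + key);
            }
            return key;
        });
        System.out.println("handled=" + r.handled + " failed=" + r.failed);
    }
}
```

With this shape, the broken trigger would still be reported on every pass, but the triggers after it would no longer be starved.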
Note: of course, the secondary issue here is how the RECOVERING_JOBS trigger got into that invalid state in the first place, but I figure making the MisfireHandler more robust is a good place to start, and it would have prevented this issue for us.
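For anyone who wants to check whether their own job store contains triggers in the same invalid state, a rough sketch of a lookup follows. It assumes the default `QRTZ_` table prefix and the standard Quartz 2.x column names seen in the error above; the class `OrphanedSimpleTriggerFinder` and its methods are hypothetical helpers, not part of Quartz, and the query has not been tuned for any particular database.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Quartz.
public class OrphanedSimpleTriggerFinder {

    /**
     * Builds an anti-join that selects SIMPLE triggers with no matching
     * row in the SIMPLE_TRIGGERS table. tablePrefix is assumed to be "QRTZ_".
     */
    public static String buildQuery(String tablePrefix) {
        return "SELECT t.TRIGGER_GROUP, t.TRIGGER_NAME"
                + " FROM " + tablePrefix + "TRIGGERS t"
                + " LEFT JOIN " + tablePrefix + "SIMPLE_TRIGGERS s"
                + " ON s.SCHED_NAME = t.SCHED_NAME"
                + " AND s.TRIGGER_NAME = t.TRIGGER_NAME"
                + " AND s.TRIGGER_GROUP = t.TRIGGER_GROUP"
                + " WHERE t.SCHED_NAME = ?"
                + " AND t.TRIGGER_TYPE = 'SIMPLE'"
                + " AND s.TRIGGER_NAME IS NULL";
    }

    /** Returns "GROUP.NAME" keys of orphaned SIMPLE triggers for one scheduler. */
    public static List<String> findOrphans(Connection conn, String schedName)
            throws SQLException {
        List<String> keys = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(buildQuery("QRTZ_"))) {
            ps.setString(1, schedName);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    keys.add(rs.getString(1) + "." + rs.getString(2));
                }
            }
        }
        return keys;
    }
}
```

In our case the scheduler name would be 'ClusterScheduler'. What to do with any rows it returns is a separate decision.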
About this issue
- State: closed
- Created 8 years ago
- Comments: 17 (1 by maintainers)
We have been able to reproduce this fairly reliably:
@howlettga we have the same problem, and it is still very much relevant. Can you please share what the bad/good patterns are here before this issue is automatically closed?
Yes, I have the same problem! I tried different combinations of API calls to resolve the problem, but failed. Not rescheduling, or deleting and then scheduling the job again, is a temporary fix … does anyone have another solution?