quartz: Triggers are getting blocked permanently
Dear Quartz Team,
We are using Quartz 2.2.1 in clustered mode with the JDBC job store to schedule jobs marked as @DisallowConcurrentExecution.
We have observed that triggers occasionally get stuck in the trigger state BLOCKED without ever recovering automatically. Looking into the job store DB tables, the pattern is always the same:
- The TRIGGER_STATE on <PREFIX>_TRIGGERS is BLOCKED
- There is no corresponding record in <PREFIX>_FIRED_TRIGGERS
Obviously org.quartz.impl.jdbcjobstore.JobStoreSupport.clusterRecover(Connection, List<SchedulerStateRecord>) will not recover such triggers, so the only way to get out of this inconsistent state is to manually set the TRIGGER_STATE back to WAITING.
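The manual check and reset described above could be scripted roughly as follows. This is only a sketch under assumptions: the class and method names are hypothetical, the table prefix is assumed to be the default QRTZ_, the column names are taken from the standard Quartz 2.x JDBC schema, and "corresponding record" is interpreted as a FIRED_TRIGGERS row for the same job. It should only be run while no scheduler instance is firing the affected job.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper, not part of Quartz.
final class StuckTriggerRepair {

    /** SELECT listing BLOCKED triggers whose job has no row in FIRED_TRIGGERS. */
    static String findStuckSql(String prefix) {
        return "SELECT t.SCHED_NAME, t.TRIGGER_NAME, t.TRIGGER_GROUP"
             + " FROM " + prefix + "TRIGGERS t"
             + " WHERE t.TRIGGER_STATE = 'BLOCKED'"
             + " AND NOT EXISTS (SELECT 1 FROM " + prefix + "FIRED_TRIGGERS f"
             + " WHERE f.SCHED_NAME = t.SCHED_NAME"
             + " AND f.JOB_NAME = t.JOB_NAME"
             + " AND f.JOB_GROUP = t.JOB_GROUP)";
    }

    /** UPDATE that moves one trigger back to WAITING, but only if it is still BLOCKED. */
    static String resetSql(String prefix) {
        return "UPDATE " + prefix + "TRIGGERS SET TRIGGER_STATE = 'WAITING'"
             + " WHERE SCHED_NAME = ? AND TRIGGER_NAME = ? AND TRIGGER_GROUP = ?"
             + " AND TRIGGER_STATE = 'BLOCKED'";
    }

    /** Runs the reset for a single trigger and returns the number of rows changed (0 or 1). */
    static int resetTrigger(Connection conn, String prefix, String schedName,
                            String triggerName, String triggerGroup) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(resetSql(prefix))) {
            ps.setString(1, schedName);
            ps.setString(2, triggerName);
            ps.setString(3, triggerGroup);
            return ps.executeUpdate();
        }
    }
}
```

The extra TRIGGER_STATE = 'BLOCKED' condition in the UPDATE is a precaution so that a trigger that has recovered in the meantime is not touched.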
It is not yet clear under which circumstances this error occurs. However, our log files indicate that the jobs get stuck at the same time as temporary database problems occur.
Below you can find an example of a NullPointerException in org.quartz.impl.jdbcjobstore.JobStoreSupport.triggersFired(List<OperableTrigger>). The exception itself was caused somewhere in the JDBC driver (Sybase jConnect) when trying to invoke rollback() on a JDBC connection. The log entry’s timestamp correlates exactly with the time the trigger got stuck.
2017 05 01 20:20:02#+00#ERROR#org.quartz.core.QuartzSchedulerThread##anonymous#ItOpScheduler_Clustered_QuartzSchedulerThread#Runtime error occurred in main trigger firing loop.
java.lang.NullPointerException: while trying to invoke the method com.sybase.jdbc4.tds.TdsCursor.setRowNum(int) of a null object loaded from field com.sybase.jdbc4.tds.CurInfo3Token._cursor of an object loaded from local variable 'this'
at com.sybase.jdbc4.tds.CurInfo3Token.getMetaInformation(CurInfo3Token.java:85)
at com.sybase.jdbc4.tds.CurInfoToken.<init>(CurInfoToken.java:130)
at com.sybase.jdbc4.tds.CurInfo3Token.<init>(CurInfo3Token.java:45)
at com.sybase.jdbc4.tds.Tds.nextResult(Tds.java:3239)
at com.sybase.jdbc4.tds.Tds.readCommandResults(Tds.java:4459)
at com.sybase.jdbc4.tds.Tds.doCommand(Tds.java:4444)
at com.sybase.jdbc4.tds.Tds.endTransaction(Tds.java:2602)
at com.sybase.jdbc4.jdbc.SybConnection.rollback(SybConnection.java:1953)
at sun.reflect.GeneratedMethodAccessor492.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.sap.core.persistence.jdbc.trace.TraceableBase$1.invoke(TraceableBase.java:44)
at com.sun.proxy.$Proxy17.rollback(Unknown Source)
at com.sap.core.persistence.jdbc.trace.TraceableConnection.rollback(TraceableConnection.java:239)
at org.apache.commons.dbcp.DelegatingConnection.rollback(DelegatingConnection.java:368)
at org.apache.commons.dbcp.DelegatingConnection.rollback(DelegatingConnection.java:368)
at org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.rollback(PoolingDataSource.java:323)
at sun.reflect.GeneratedMethodAccessor492.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.quartz.impl.jdbcjobstore.AttributeRestoringConnectionInvocationHandler.invoke(AttributeRestoringConnectionInvocationHandler.java:73)
at com.sun.proxy.$Proxy143.rollback(Unknown Source)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.rollbackConnection(JobStoreSupport.java:3658)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3817)
at org.quartz.impl.jdbcjobstore.JobStoreSupport.triggersFired(JobStoreSupport.java:2908)
at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:336)
Please let me know if you need additional details.
Thanks for your support, Sebastian
About this issue
- State: closed
- Created 7 years ago
- Reactions: 1
- Comments: 22 (2 by maintainers)
Commits related to this issue
- release BLOCKED triggers in releaseAcquiredTrigger * relates to https://github.com/quartz-scheduler/quartz/pull/146 , https://github.com/quartz-scheduler/quartz/issues/145 * relates to #741 #800 — committed to quartznet/quartznet by lahma 5 years ago
- release BLOCKED triggers in releaseAcquiredTrigger * applying commit https://github.com/quartznet/quartznet/commit/05fd35c0f86553506300fcbf251338afeca5f109 * related to https://github.com/quartz-sch... — committed to SilviaDGregorio/cosmosdb-quartznet by deleted user 2 years ago
I was able to reproduce the issue in the debugger.
If org.quartz.impl.jdbcjobstore.JobStoreSupport.triggerFired(Connection, OperableTrigger) throws a RuntimeException after the trigger state has been set to BLOCKED, the trigger will get stuck. The reason for this is that QuartzSchedulerThread.run() will call org.quartz.spi.JobStore.releaseAcquiredTrigger(OperableTrigger) in case of RuntimeExceptions, which will delete the record from <PREFIX>_FIRED_TRIGGERS but will not set the trigger state back from BLOCKED to WAITING.
Probably org.quartz.impl.jdbcjobstore.JobStoreSupport.releaseAcquiredTrigger(Connection, OperableTrigger) should set the trigger state back to WAITING from both ACQUIRED (which it already does) and from BLOCKED.
Best Regards, Sebastian
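The difference between the current and the proposed release behavior can be illustrated with a small in-memory model. This is a toy sketch, not Quartz code: the class ReleaseModel, its fields, and its method names are invented for illustration, and the two collections merely stand in for the <PREFIX>_TRIGGERS and <PREFIX>_FIRED_TRIGGERS tables.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of the two tables; all names here are hypothetical.
final class ReleaseModel {
    enum State { WAITING, ACQUIRED, BLOCKED }

    final Map<String, State> triggers = new HashMap<>();   // stands in for <PREFIX>_TRIGGERS
    final Set<String> firedTriggers = new HashSet<>();     // stands in for <PREFIX>_FIRED_TRIGGERS

    /** Behavior described in the report: only ACQUIRED is reset, the fired record is always deleted. */
    void releaseAcquiredOnly(String key) {
        if (triggers.get(key) == State.ACQUIRED) {
            triggers.put(key, State.WAITING);
        }
        firedTriggers.remove(key);
    }

    /** Proposed behavior: reset to WAITING from ACQUIRED and from BLOCKED. */
    void releaseAcquiredOrBlocked(String key) {
        State current = triggers.get(key);
        if (current == State.ACQUIRED || current == State.BLOCKED) {
            triggers.put(key, State.WAITING);
        }
        firedTriggers.remove(key);
    }
}
```

In this model, releasing a BLOCKED trigger with the first method leaves exactly the pattern from the original report (state BLOCKED, no fired record), while the second method returns it to WAITING.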
Our team deployed the latest Quartz JAR, version 2.3.2, to our production server last Friday. However, we still see BLOCKED triggers in the qrtz_triggers table for our email notification immediately after a Catalina restart. Is there any follow-up or advice for this behavior? Would enabling TRACE logging on the “org.quartz” package help? We have the clustered option set in quartz.properties, and the Quartz tables are in the same DB schema as our app tables.
Has anyone enabled JMX remote access to the MBeans as a potential workaround to reset the trigger state to WAITING and then immediately fire the trigger? https://dzone.com/articles/how-manage-quartz-remotely
Most of these blocked triggers have changed from BLOCKED to WAITING automatically since we added the JobListener and upgraded Quartz from 2.3.0 to the latest version, 2.3.2. However, one or two triggers still remain BLOCKED in my case.
After 6 hours of trying to find a solution, I came across your answer, and it was exactly what was happening in my code. Thank you.
For those who are still seeing this issue: if you implemented the JobListener interface, make sure you handle exceptions yourself within the jobWasExecuted method. Quartz does not handle exceptions thrown in that method, and that could leave your job in the BLOCKED state without ever recovering. We experienced this with Quartz version 2.3.0.
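One way to follow this advice is to route everything done inside jobWasExecuted through a small guard that catches and logs exceptions instead of letting them escape. The sketch below is only an illustration: GuardedCallback and its run method are hypothetical names, not part of Quartz, and it uses java.util.logging so that it has no external dependencies.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper for use inside a JobListener implementation.
final class GuardedCallback {
    private static final Logger LOG = Logger.getLogger(GuardedCallback.class.getName());

    /**
     * Runs the given step and swallows any RuntimeException after logging it.
     * Returns true if the step completed normally, false if it threw.
     */
    static boolean run(String description, Runnable step) {
        try {
            step.run();
            return true;
        } catch (RuntimeException e) {
            LOG.log(Level.WARNING, "Listener step failed: " + description, e);
            return false;
        }
    }
}
```

Inside a listener, the body of jobWasExecuted would then be wrapped in a call such as GuardedCallback.run("post-execution bookkeeping", () -> { ... }), so that a failure in the listener's own logic is logged rather than propagated to the scheduler.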