quartznet: Scheduler cannot recover from database connection error
Version: 3.0.7
Expected behavior
If a job is scheduled using a simple repeating trigger then a temporary loss of a database connection should be recoverable.
Actual behavior
Jobs scheduled on that scheduler does never recover and start working normally again.
Steps to reproduce
- Setup a database and configure the quartz standard scheduler factory to use it. In my example I’m using PostgreSQL.
- Create a scheduler and schedule a job with a simple trigger that runs every 10 seconds.
- While you know that the job is not executing, restart the PostgreSQL database service.
- See that the job configured with a simple trigger that should run every ten seconds no longer runs, even if you wait for 5 minutes.
The example uses .NET Core 2.2, NpgSQL 4.0.7 and Quartz 3.0.7
using Quartz;
using Quartz.Impl;
using System;
using System.IO;
using System.Threading.Tasks;
namespace QuartzConnectionLossExample
{
class Program
{
// Please don't target this against a database that you are using for other stuff.
private static readonly string _connectionString = "Host=localhost;Port=5432;Database=quartztest;UserName=postgres;Password=postgres";
/// <summary>
/// Path to the script that sets up quartz database tables. Set to null if you don't want to clear the public schema and create quartz database objects.
/// </summary>
private static readonly string _quartzSQLScriptPath = "Sql/quartzSetupScript.sql";
private static readonly bool dontSetup = false;
private static readonly System.Collections.Specialized.NameValueCollection quartzConfig =
new System.Collections.Specialized.NameValueCollection()
{
{ "quartz.serializer.type", "json" },
{ "quartz.jobStore.type", "Quartz.Impl.AdoJobStore.JobStoreTX, Quartz" },
{ "quartz.jobStore.driverDelegateType"," Quartz.Impl.AdoJobStore.PostgreSQLDelegate, Quartz" },
{ "quartz.jobStore.tablePrefix", "QRTZ_" },
{ "quartz.jobStore.dataSource", "PostgreSql" },
{ "quartz.dataSource.PostgreSql.connectionString", _connectionString },
{ "quartz.dataSource.PostgreSql.provider", "Npgsql" },
{ "quartz.plugin.triggHistory.type", "Quartz.Plugin.History.LoggingJobHistoryPlugin, Quartz.Plugins" },
{ "quartz.threadPool.threadCount", "2" }
};
private static readonly StdSchedulerFactory _schedulerFactory = new StdSchedulerFactory(quartzConfig);
static void Main(string[] args)
{
Console.WriteLine("Example start.");
if (!dontSetup)
{
SetupDatabase();
}
var scheduler = _schedulerFactory.GetScheduler().GetAwaiter().GetResult();
SetupJobAndSimpleTrigger(scheduler);
Console.WriteLine("Scheduler started, feel free to restart your database service and see if it can recover from that.");
Console.ReadLine();
}
private static void SetupJobAndSimpleTrigger(IScheduler scheduler)
{
IJobDetail job = null;
if (!scheduler.CheckExists(new JobKey("TestJob", "TestGroup")).GetAwaiter().GetResult())
{
job = JobBuilder.Create<TestJob>()
.WithIdentity(new JobKey("TestJob", "TestGroup"))
.WithDescription("Test.")
.RequestRecovery(true)
.StoreDurably(true)
.Build();
}
else
{
job = scheduler.GetJobDetail(new JobKey("TestJob", "TestGroup")).GetAwaiter().GetResult();
}
if (!scheduler.CheckExists(new TriggerKey("TestTrigger", "TestGroup")).GetAwaiter().GetResult() && job != null)
{
var trigger = TriggerBuilder.Create()
.WithIdentity(new TriggerKey("TestTrigger", "TestGroup"))
.WithDescription("Runs every 10 seconds")
.WithSimpleSchedule(x => x
.WithIntervalInSeconds(10)
.RepeatForever()
.WithMisfireHandlingInstructionIgnoreMisfires()
)
.StartNow()
.EndAt(null)
.Build();
scheduler.ScheduleJob(job, trigger).GetAwaiter().GetResult();
}
scheduler.Start().GetAwaiter().GetResult();
}
static void SetupDatabase()
{
#region Clean public schema and setup quartz database objects.
if (!string.IsNullOrWhiteSpace(_quartzSQLScriptPath))
{
Console.WriteLine("Setting up database");
using (var connection = new Npgsql.NpgsqlConnection(_connectionString))
{
connection.Open();
using (var command = connection.CreateCommand())
{
command.CommandText =
$@"
DROP SCHEMA ""public"" CASCADE;
CREATE SCHEMA ""public"";
" + File.ReadAllText(_quartzSQLScriptPath);
using (var transaction = connection.BeginTransaction())
{
command.Transaction = transaction;
command.ExecuteNonQuery();
transaction.Commit();
}
}
}
Console.WriteLine("Database setup completed");
}
else
{
Console.WriteLine("No setup script specified, setup skipped.");
}
#endregion
}
}
// Once you hit the breakpoint below you should hit continue immediately, and then immediately try to restart the database.
// Basically it will only fail if the database connection error occurs before the Execute method is called.
[DisallowConcurrentExecution]
public class TestJob : IJob
{
public Task Execute(IJobExecutionContext context)
{
System.Diagnostics.Debugger.Break();
return Task.Delay(1000);
}
}
}
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 31 (16 by maintainers)
Commits related to this issue
- Update the PostGreSql delegate to provide retries. — committed to SeanFarrow/quartznet by SeanFarrow 5 years ago
I am having the exact same issue with MS SQL. ERROR Quartz.Impl.AdoJobStore.ConnectionAndTransactionHolder (null) Staging - Couldn’t rollback ADO.NET connection. Transaction not connected, or was disconnected System.InvalidOperationException: Transaction not connected, or was disconnected at Quartz.Impl.AdoJobStore.ConnectionAndTransactionHolder.CheckNotZombied() at Quartz.Impl.AdoJobStore.ConnectionAndTransactionHolder.Rollback(Boolean transientError)
Is there a workaround? The issues is that ALL of the jobs are halted until the service is restarted.
I’m having this issue also, using MS SQL.
2019-12-06 13:18:50,276 [91] ERROR Quartz.Impl.AdoJobStore.ConnectionAndTransactionHolder (null) Staging - Couldn’t rollback ADO.NET connection. Transaction not connected, or was disconnected System.InvalidOperationException: Transaction not connected, or was disconnected at Quartz.Impl.AdoJobStore.ConnectionAndTransactionHolder.CheckNotZombied() at Quartz.Impl.AdoJobStore.ConnectionAndTransactionHolder.Rollback(Boolean transientError)