sidekiq-unique-jobs: incorrect `:while_executing` behavior

Describe the bug enqueue the job with while_executing lock policy and with uniqueness by the first argument 3 times with interval 1 second

Expected behavior all jobs should be placed in the queue, second job should wait until first one is finished, and then third job should wait until second one is finished

Current behavior all jobs are placed in the queue, but only first and second job are performed. moreover, if we enqeue 3 jobs again, now only first job will be performed

Worker class

class TestJob
  include Sidekiq::Worker

  sidekiq_options(
    queue: 'test_queue',
    unique: :while_executing,
    unique_args: -> (args) do
      [
        args[0]
      ]
    end
  )

  def perform(*arguments)
    Rails.logger.info "!!! start with #{arguments.inspect}"
    (1..120000).inject(:*)
    Rails.logger.info '!!! finish'
  end
end

Additional context here is a log from sidekiq (5.2.5) and sidekiq-unique-jobs (5.0.10), where it works correctly:

>> TestJob.perform_async 'foo', 1
=> "aa45c34edcfaa56b114e78db"
[Newrelic::CustomMetricService] record Custom/Sidekiq::Queue::TestQueue/size, 0
[Newrelic::CustomMetricService] record Custom/Sidekiq::Queue::TestQueue/latency, 0
>> TestJob.perform_async 'foo', 2
[Newrelic::CustomMetricService] record Custom/Sidekiq::Queue::TestQueue/size, 0
[Newrelic::CustomMetricService] record Custom/Sidekiq::Queue::TestQueue/latency, 0
=> "72397550278883430400c2b7"
>> TestJob.perform_async 'foo', 3
[Newrelic::CustomMetricService] record Custom/Sidekiq::Queue::TestQueue/size, 1
[Newrelic::CustomMetricService] record Custom/Sidekiq::Queue::TestQueue/latency, 0.0016357898712158203
=> "aaecc71644ab1aa2b0ae2094"



2019-10-10T13:28:50.100Z 675 TID-grow3h2rb TestJob JID-aa45c34edcfaa56b114e78db INFO: start
2019-10-10T13:28:51.547Z 675 TID-grow3h2i7 TestJob JID-72397550278883430400c2b7 INFO: start
2019-10-10T13:28:51.853Z 675 TID-grow3h2rb TestJob JID-aa45c34edcfaa56b114e78db INFO: !!! start with ["foo", 1]
2019-10-10T13:29:01.265Z 675 TID-grow3h2rb TestJob JID-aa45c34edcfaa56b114e78db INFO: !!! finish
2019-10-10T13:29:01.266Z 675 TID-grow3h2rb TestJob JID-aa45c34edcfaa56b114e78db INFO: done: 11.166 sec
2019-10-10T13:29:01.267Z 675 TID-grow3h2rb TestJob JID-aaecc71644ab1aa2b0ae2094 INFO: start
2019-10-10T13:29:02.062Z 675 TID-grow3h2i7 TestJob JID-72397550278883430400c2b7 INFO: !!! start with ["foo", 2]
2019-10-10T13:29:09.887Z 675 TID-grow3h2i7 TestJob JID-72397550278883430400c2b7 INFO: !!! finish
2019-10-10T13:29:09.887Z 675 TID-grow3h2i7 TestJob JID-72397550278883430400c2b7 INFO: done: 18.34 sec
2019-10-10T13:29:09.925Z 675 TID-grow3h2rb TestJob JID-aaecc71644ab1aa2b0ae2094 INFO: !!! start with ["foo", 3]
2019-10-10T13:29:18.445Z 675 TID-grow3h2rb TestJob JID-aaecc71644ab1aa2b0ae2094 INFO: !!! finish
2019-10-10T13:29:18.445Z 675 TID-grow3h2rb TestJob JID-aaecc71644ab1aa2b0ae2094 INFO: done: 17.179 sec

and this is from sidekiq (6.0.1), sidekiq-unique-jobs (7.0.0.beta2)

>> TestJob.perform_async 'foo', 1
[Newrelic::CustomMetricService] record Custom/Sidekiq::Queue::TestQueue/size, 0
[Newrelic::CustomMetricService] record Custom/Sidekiq::Queue::TestQueue/latency, 0
=> "35c69d21c26b4385ac0f1e14"
>> TestJob.perform_async 'foo', 2
[Newrelic::CustomMetricService] record Custom/Sidekiq::Queue::TestQueue/size, 0
[Newrelic::CustomMetricService] record Custom/Sidekiq::Queue::TestQueue/latency, 0
=> "9b8a444211a6287952cb03d6"
>> TestJob.perform_async 'foo', 3
[Newrelic::CustomMetricService] record Custom/Sidekiq::Queue::TestQueue/size, 1
[Newrelic::CustomMetricService] record Custom/Sidekiq::Queue::TestQueue/latency, 0.0014061927795410156
=> "2b145d5d10fb3dd9523c6f4c"


2019-10-10T13:39:42.642Z pid=11435 tid=gq64bagkv class=TestJob jid=35c69d21c26b4385ac0f1e14 INFO: start
2019-10-10T13:39:44.146Z pid=11435 tid=gq64bah33 class=TestJob jid=9b8a444211a6287952cb03d6 INFO: start
2019-10-10T13:39:44.666Z pid=11435 tid=gq64bagkv class=TestJob jid=35c69d21c26b4385ac0f1e14 uniquejobs=server =uniquejobs:c3bd986d71c5f4ff21ed14cbafae3b28:RUN DEBUG: Executed queue.lua in 1ms
2019-10-10T13:39:44.668Z pid=11435 tid=gq64bagkv class=TestJob jid=35c69d21c26b4385ac0f1e14 uniquejobs=server =uniquejobs:c3bd986d71c5f4ff21ed14cbafae3b28:RUN DEBUG: Executed lock.lua in 2ms
2019-10-10T13:39:44.668Z pid=11435 tid=gq64bagkv class=TestJob jid=35c69d21c26b4385ac0f1e14 uniquejobs=server =uniquejobs:c3bd986d71c5f4ff21ed14cbafae3b28:RUN INFO: !!! start with ["foo", 1]
2019-10-10T13:39:54.411Z pid=11435 tid=gq64bagkv class=TestJob jid=35c69d21c26b4385ac0f1e14 uniquejobs=server =uniquejobs:c3bd986d71c5f4ff21ed14cbafae3b28:RUN INFO: !!! finish
2019-10-10T13:39:54.412Z pid=11435 tid=gq64bagkv class=TestJob jid=35c69d21c26b4385ac0f1e14 uniquejobs=server =uniquejobs:c3bd986d71c5f4ff21ed14cbafae3b28:RUN DEBUG: Executed unlock.lua in 2ms
2019-10-10T13:39:54.413Z pid=11435 tid=gq64bagkv class=TestJob jid=35c69d21c26b4385ac0f1e14 elapsed=11.771 INFO: done
2019-10-10T13:39:54.414Z pid=11435 tid=gq64bagkv class=TestJob jid=2b145d5d10fb3dd9523c6f4c INFO: start
2019-10-10T13:39:55.272Z pid=11435 tid=gq64bah33 class=TestJob jid=9b8a444211a6287952cb03d6 uniquejobs=server =uniquejobs:c3bd986d71c5f4ff21ed14cbafae3b28:RUN DEBUG: Executed queue.lua in 0ms
2019-10-10T13:39:55.274Z pid=11435 tid=gq64bagkv class=TestJob jid=2b145d5d10fb3dd9523c6f4c uniquejobs=server =uniquejobs:c3bd986d71c5f4ff21ed14cbafae3b28:RUN DEBUG: Executed queue.lua in 1ms
2019-10-10T13:39:55.275Z pid=11435 tid=gq64bah33 class=TestJob jid=9b8a444211a6287952cb03d6 uniquejobs=server =uniquejobs:c3bd986d71c5f4ff21ed14cbafae3b28:RUN DEBUG: Executed lock.lua in 0ms
2019-10-10T13:39:55.275Z pid=11435 tid=gq64bah33 class=TestJob jid=9b8a444211a6287952cb03d6 uniquejobs=server =uniquejobs:c3bd986d71c5f4ff21ed14cbafae3b28:RUN INFO: !!! start with ["foo", 2]
2019-10-10T13:39:55.275Z pid=11435 tid=gq64bagkv class=TestJob jid=2b145d5d10fb3dd9523c6f4c uniquejobs=server =uniquejobs:c3bd986d71c5f4ff21ed14cbafae3b28:RUN WARN: Timed out after 0s while waiting for primed token (digest: uniquejobs:c3bd986d71c5f4ff21ed14cbafae3b28:RUN, job_id: 2b145d5d10fb3dd9523c6f4c)
2019-10-10T13:39:55.470Z pid=11435 tid=gq64bagkv class=TestJob jid=2b145d5d10fb3dd9523c6f4c uniquejobs=server =uniquejobs:c3bd986d71c5f4ff21ed14cbafae3b28:RUN DEBUG: Executed unlock.lua in 98ms
2019-10-10T13:39:55.830Z pid=11435 tid=gq64bagkv class=TestJob jid=2b145d5d10fb3dd9523c6f4c elapsed=1.416 INFO: done
2019-10-10T13:40:04.356Z pid=11435 tid=gq64bah33 class=TestJob jid=9b8a444211a6287952cb03d6 uniquejobs=server =uniquejobs:c3bd986d71c5f4ff21ed14cbafae3b28:RUN INFO: !!! finish
2019-10-10T13:40:04.357Z pid=11435 tid=gq64bah33 class=TestJob jid=9b8a444211a6287952cb03d6 uniquejobs=server =uniquejobs:c3bd986d71c5f4ff21ed14cbafae3b28:RUN DEBUG: Executed unlock.lua in 1ms
2019-10-10T13:40:04.358Z pid=11435 tid=gq64bah33 class=TestJob jid=9b8a444211a6287952cb03d6 elapsed=20.212 INFO: done

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Reactions: 1
  • Comments: 20 (10 by maintainers)

Most upvoted comments

@mhenrixon I’ve tried with config.cache_classes = true and config.eager_load = true - and got the same result

@roman-melnyk it depends on your configuration. If you have a lock_timeout set it only waits that long from when the job is picked of the queue. I think there should be some log entry for that. Perhaps add on_conflict: :log to debug.

If you want to make sure no job is lost you can use on_conflict: :raise which will allow the job to be retried. You could also reschedule it.

@mhenrixon 👍🏼 confirming correct work in 7.0.4 version

@roman-melnyk can you try to upgrade sidekiq-unique-jobs and let me know if you are still having trouble? I’ve released quite a few bug fixes since you posted this.

@roman-melnyk the delay on push will be because you are having a lock timeout of 5 minutes. I strongly recommend against that but it shows the need for separating the lock timeout for client and server middlewares.