bullmq: [Bug]: Worker stopped processing jobs, and mostly delayed Jobs
Version
v5.4.2
Platform
NodeJS
What happened?
We have a service in which a worker processes jobs. After processing finishes, the worker creates another job with a delay of around 64 minutes. Today I noticed that the service and worker had stopped processing jobs. There were no error messages in the logs. In BullBoard (which I use as a UI to inspect jobs), the jobs were still in the delayed state and roughly 24 hours overdue.
When I restarted the service and the worker started, it immediately began processing those delayed jobs. This is not the first time it has happened. Today, though, I checked the delayed jobs first.
In today’s incident, the service had been running for 4 days.
We run on EKS in AWS (a Node.js service written in TypeScript). I use BullMQ Pro with Groups, and each group has its concurrency set to 1.
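Since nothing was logged, one way to notice this condition earlier is a periodic check for delayed jobs that are past their due time. The following is only a minimal sketch of that check, not something taken from the service above: `DelayedJobInfo`, `findOverdueJobs`, and the grace period are hypothetical names and values, and the due time is assumed to be the job's creation timestamp plus its delay.

```typescript
// Hypothetical shape: only the fields needed to decide whether a delayed job is overdue.
interface DelayedJobInfo {
  id: string;
  timestamp: number; // ms since epoch when the job was added
  delay: number; // requested delay in ms
}

// Returns the delayed jobs whose due time (timestamp + delay) passed more than
// `graceMs` ago, together with how late each one is.
function findOverdueJobs(
  jobs: DelayedJobInfo[],
  nowMs: number,
  graceMs: number,
): { id: string; overdueMs: number }[] {
  return jobs
    .map((job) => ({ id: job.id, overdueMs: nowMs - (job.timestamp + job.delay) }))
    .filter((entry) => entry.overdueMs > graceMs);
}

// Example with made-up numbers mirroring the report: a 64-minute delay,
// inspected 24 hours after the job should have run.
const MINUTE_MS = 60_000;
const HOUR_MS = 60 * MINUTE_MS;
const sampleJobs: DelayedJobInfo[] = [
  { id: "late", timestamp: 0, delay: 64 * MINUTE_MS },
  { id: "on-time", timestamp: 24 * HOUR_MS, delay: 64 * MINUTE_MS },
];
const overdue = findOverdueJobs(sampleJobs, 64 * MINUTE_MS + 24 * HOUR_MS, 5 * MINUTE_MS);
console.log(overdue);
```

In a real service the input would presumably come from the queue's list of delayed jobs, and a non-empty result could raise an alert or trigger a worker restart.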
How to reproduce.
I don’t have any test code for this
Relevant log output
No logs or error logs were produced.
Code of Conduct
- I agree to follow this project’s Code of Conduct
About this issue
- Original URL
- State: closed
- Created 4 months ago
- Reactions: 1
- Comments: 91 (35 by maintainers)
Commits related to this issue
- chore(release): 5.7.1 [skip ci] ## [5.7.1](https://github.com/taskforcesh/bullmq/compare/v5.7.0...v5.7.1) (2024-04-10) ### Bug Fixes * **worker:** use 0.002 as minimum timeout for redis version low... — committed to taskforcesh/bullmq by semantic-release-bot 3 months ago
- fix(worker): use 0.002 as minimum timeout for redis version lower than 7.0.8 (#2515) fixes #2466 — committed to taskforcesh/bullmq by roggervalf 3 months ago
- perf(worker): do not call bzpopmin when blockDelay is lower or equal 0 (#2544) ref #2466 — committed to taskforcesh/bullmq by roggervalf 2 months ago
- chore(release): 5.7.6 [skip ci] ## [5.7.6](https://github.com/taskforcesh/bullmq/compare/v5.7.5...v5.7.6) (2024-04-27) ### Bug Fixes * **redis-connection:** increase redis retry strategy backoff ([... — committed to taskforcesh/bullmq by semantic-release-bot 2 months ago
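The two worker commits above describe clamping the blocking timeout to a minimum of 0.002 seconds on Redis versions lower than 7.0.8 and skipping the blocking call when the block delay is zero or negative. The sketch below only illustrates that logic as worded in the commit titles; it is not BullMQ's actual code. The function names, the 0.001 minimum for newer Redis versions, and the 10-second cap are assumptions.

```typescript
// Compares dotted version strings numerically; missing parts count as 0.
function isVersionLowerThan(version: string, target: string): boolean {
  const a = version.split(".").map(Number);
  const b = target.split(".").map(Number);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    if (x !== y) return x < y;
  }
  return false;
}

// Returns the timeout in seconds to pass to a blocking call, or null when the
// blocking call should be skipped entirely (block delay <= 0).
function blockTimeoutSeconds(
  blockDelayMs: number,
  redisVersion: string,
  maxSeconds = 10, // assumed upper bound, not taken from the issue
): number | null {
  if (blockDelayMs <= 0) return null;
  // 0.002 comes from the commit title; 0.001 for newer versions is an assumption.
  const minimum = isVersionLowerThan(redisVersion, "7.0.8") ? 0.002 : 0.001;
  return Math.min(Math.max(blockDelayMs / 1000, minimum), maxSeconds);
}

console.log(blockTimeoutSeconds(1, "6.2.6"));
console.log(blockTimeoutSeconds(0, "7.2.0"));
```

Returning `null` rather than a tiny timeout keeps the "do not block at all" case explicit for the caller.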
5.7.6 fixes it for us
Unfortunately, we are still experiencing the issue with the latest version. Could you please let me know which version introduced this issue, so we can downgrade to the release before it?