pg-boss: No retry for failed jobs
I am using `done(error)` if an error occurs.
`retrylimit = 2`
But it seems to ignore the retry settings for a failed job.
Is this on purpose?
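A minimal sketch of this setup, assuming the callback-style `subscribe(name, handler)` API and the camel-cased `retryLimit` publish option; the queue name, connection string, and work function are made up for illustration:

```js
const PgBoss = require('pg-boss');

const boss = new PgBoss('postgres://user:pass@localhost/db'); // hypothetical connection

// hypothetical work that always fails, so done(error) gets called
function doWork(data) {
  return Promise.reject(new Error('boom'));
}

boss.start().then(() => {
  // publish with a retry limit of 2 (the setting that appears to be ignored)
  return boss.publish('some-job', { id: 123 }, { retryLimit: 2 }).then(() => {
    boss.subscribe('some-job', (job, done) => {
      doWork(job.data)
        .then(() => done())        // success
        .catch(err => done(err));  // failure: a retry is expected here, but none happens
    });
  });
});
```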
About this issue
- State: closed
- Created 7 years ago
- Reactions: 2
- Comments: 20 (13 by maintainers)
Thanks, guys. We were using Kue when I wrote pg-boss, actually. It was my inspiration along with HangFire. We had support challenges with the Redis windows port (I know, right?) at the time, and then when I read into the docs about how to “tune” Redis to not lose writes if it were to crash, I was convinced it was just an architectural mismatch to rely too much on a system that didn’t offer guaranteed writes. This is why I drop those hints and the redis docs link in the readme.
Feel free to make any suggestions on desired features. I’ve mostly built this for my own selfish consumption and needs, but I do like the idea of proving the “you can’t build a queue in a database” crowd wrong. 😃
I know I will find a breaking point of this solution at some point, just being a “plain ole relational database” and all, but until I do, I’m going to keep pushing and optimizing it. My job processing volumes are now in the thousands, and it’s holding up quite well as long as I tune my archiving intervals down to keep the job table as lean as possible.
All these queues are great. Let’s have a feature-per-line-of-code contest. Coveralls says pg-boss has 413 of 413 relevant lines covered (100.0%). How many lines of code are in your queue? :trollface:
This has become more annoying in my own projects lately as my orchestrations have grown, and so has the number of expire, fail, and complete handlers. This is going to be quite disruptive, no doubt.
Right now, my thoughts on the new api are:
- consolidate the api to `onComplete()` only, which has `request` and `response` as usual, with an added `result` prop (or something like that) with `state` (success or failed); expired could then just be a “reason” for the failure
- remove the `onFail()` and `onExpire()` subs

I have 400 megabytes of `node_modules`, does that count?

Hey there @elmigranto. Yes, until I implement this enhancement, you will need to handle this using other techniques. I wouldn’t recommend using in-memory state unless you’re ok with losing that job.
One way in which you could build this using the existing api would be to add a subscription via `onFail()` and just republish the job after incrementing a retry counter in the payload. Then, on each subsequent failure, you can check the retry count and decide if you want to re-publish or not. You can use the `startIn` publish option for your delay, like the following.
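A rough sketch of that workaround, assuming the failure job handed to `onFail()` exposes the original job under `data.request` and that `startIn` accepts an interval string; both of those details, along with the queue name and work function, are assumptions rather than confirmed API facts:

```js
const PgBoss = require('pg-boss');

const boss = new PgBoss('postgres://user:pass@localhost/db'); // hypothetical connection
const MAX_RETRIES = 2; // hypothetical retry ceiling

// hypothetical work function
function sendEmail(data) {
  return Promise.reject(new Error('smtp down'));
}

boss.start().then(() => {
  boss.subscribe('email', (job, done) => {
    sendEmail(job.data)
      .then(() => done())
      .catch(err => done(err)); // failing the job is what fires the onFail subscription below
  });

  // Manual retry: on each failure, re-publish with an incremented counter and a delay.
  boss.onFail('email', failure => {
    const original = failure.data.request;                  // assumed shape of the failure payload
    const retryCount = (original.data.retryCount || 0) + 1;

    if (retryCount <= MAX_RETRIES) {
      const data = Object.assign({}, original.data, { retryCount });
      boss.publish(original.name, data, { startIn: '30 seconds' }); // delayed re-publish
    }
    // otherwise give up and leave the job failed
  });
});
```

Keeping the counter in the job payload rather than in process memory means the retry state survives worker restarts, which fits the earlier point about not relying on in-memory state.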