django-celery: Scheduled tasks are being duplicated

I’m not 100% sure how to go about finding the root of the issue here, so I’ll try to give as much info as possible, so that hopefully someone can help me figure out whether this is a Celery bug or a bug in my configuration.

I have a site that goes through and creates a “weekly status” post for each blog on the site. Sometimes the weekly status post is created 2x or 3x, and it’s even happened 4x.

The task is defined as such in my code:

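# tasks.py
# Import paths assumed for Celery 2.x/3.x; they were not shown in the original.
from celery.schedules import crontab
from celery.task import periodic_task

# Runs every Monday at 00:00.
@periodic_task(run_every=crontab(hour=0, minute=0, day_of_week="monday"))
def weekly_report():
    # abridged for brevity; `blogs` is defined elsewhere
    for blog in blogs:
        do_weekly_report(blog)  # pass the blog being reported on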

I’m not 100% certain what’s happening here.

I did notice this duplicate behavior around the time I started allowing people to “schedule” reminders.

Basically, an activity can be scheduled at some point in the future, and the user will be reminded 1 hour before the activity is scheduled to start. Maybe I’m scheduling the reminders incorrectly.

I have a listener that listens for activities being saved. If the activity is in the future and the user has set up reminders, a task is scheduled for the future:

  • if the activity is created, schedule it
  • if the activity is being changed, look to see if the scheduled date has changed.
    • if it has changed, delete the queued task, and replace it with a new one
    • if it hasn’t changed, just save the activity and leave the task alone.
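The decision logic in the list above can be sketched in plain Python. This is only an illustration, not the code from the site: `plan_reminder`, `REMINDER_LEAD`, the action names, and the "cancel" case (an activity moved into the past) are all assumptions made for the example.

```python
from datetime import datetime, timedelta

REMINDER_LEAD = timedelta(hours=1)  # remind 1 hour before the activity (hypothetical name)


def plan_reminder(old_start, new_start, now, wants_reminder=True):
    """Decide what to do with the reminder task when an activity is saved.

    old_start is None for a newly created activity.
    Returns (action, eta), where action is one of
    "schedule", "reschedule", "keep", "cancel" or "skip".
    """
    eta = new_start - REMINDER_LEAD
    is_future = wants_reminder and new_start > now
    if old_start is None:
        # New activity: queue a reminder only if it lies in the future.
        return ("schedule", eta) if is_future else ("skip", None)
    if old_start == new_start:
        # Date unchanged: leave the queued task alone.
        return ("keep", None)
    if is_future:
        # Date changed: delete the queued task and queue a new one.
        return ("reschedule", eta)
    # Assumption: an activity moved into the past only needs the old task removed.
    return ("cancel", None)
```

In a real listener, "schedule" and "reschedule" would presumably map to something like `apply_async(eta=eta)` on the reminder task, and "reschedule" and "cancel" to revoking the stored task id first.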

This seems to work just fine, because I’ve had no complaints from anyone about the reminders, but the weekly periodic task sometimes repeats (sometimes not at all, sometimes up to 4 times).

Looking at ps aux | grep celery, this is what I see right now:

nobody   15988  0.2  8.9 312656 45156 ?        Sl   04:11   1:44 python /manage.py celeryd -v 2 -B -s celery -E -l INFO
nobody   16020  0.0  8.8 271408 44304 ?        S    04:11   0:01 python /manage.py celeryd -v 2 -B -s celery -E -l INFO
nobody   16021  0.0  7.2 263696 36320 ?        S    04:11   0:01 python /manage.py celeryd -v 2 -B -s celery -E -l INFO
nobody   16022  0.0  7.4 264536 37528 ?        S    04:11   0:00 python /manage.py celeryd -v 2 -B -s celery -E -l INFO
nobody   16023  0.0  8.7 271468 43736 ?        S    04:11   0:02 python /manage.py celeryd -v 2 -B -s celery -E -l INFO
nobody   16028  0.0  7.1 295972 35928 ?        S    04:11   0:18 python /manage.py celeryd -v 2 -B -s celery -E -l INFO

This is all running on 1 box, with a Redis backend.

About this issue

  • State: closed
  • Created 12 years ago
  • Comments: 25 (5 by maintainers)

Most upvoted comments

Hi fdx, did you manage to fix this problem? Looking at your processes, could it be something to do with the fact that you are running the celerybeat scheduler (with the celeryd -B option) multiple times? How are you launching your worker processes? I have mistakenly run two workers with celerybeat before, and it led to duplication of tasks as each one was being scheduled twice.

In the Celery docs for celeryd:

-B, --beat Also run the celerybeat periodic task scheduler. Please note that there must only be one instance of this service.

There is a good explanation of this problem in this article: http://rdegges.com/devops-django-part-3-the-heroku-way

Hope you’ve already found a solution to your problem, but if not maybe this will help!

Yes.

Example:

Set the timeout to 1 hour and schedule a task 20 hours later: the task gets duplicated 19 times.

Set the timeout to 48 hours and schedule a task 20 hours later: the task runs normally.

On 8 November 2015 at 20:43, ubhisat notifications@github.com wrote:

@pa-jama https://github.com/pa-jama do you mean this setting? http://docs.celeryproject.org/en/latest/getting-started/brokers/redis.html#visibility-timeout

I am having similar issues.


I changed the Redis TIMEOUT setting, or whatever it’s called, made it higher, and the problem went away.

On 30 September 2015 at 22:53, Alexeew Artemiy notifications@github.com wrote:

I changed from RabbitMQ to Redis, and it works fine.


Cheers, Ben
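For reference, a minimal sketch of raising the Redis visibility timeout discussed in the comments above. It assumes the old-style uppercase setting names used with django-celery; `LONGEST_ETA` and the 4-hour headroom are illustrative values, not taken from the thread. With the Redis transport the default visibility timeout is 1 hour, and a task whose ETA lies beyond it can be redelivered and run more than once.

```python
# settings.py (sketch)
from datetime import timedelta

# Hypothetical: the longest delay expected between queuing a task and its ETA.
LONGEST_ETA = timedelta(hours=20)

# Keep the visibility timeout above the longest ETA, with some headroom,
# so the broker does not redeliver tasks that are still waiting for their ETA.
BROKER_TRANSPORT_OPTIONS = {
    "visibility_timeout": int((LONGEST_ETA + timedelta(hours=4)).total_seconds()),
}
```

The trade-off is that a very long timeout also delays redelivery of tasks that were genuinely lost when a worker died.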

I ran into the same problem: scheduled tasks were triggered many times at once. In the end, the reason turned out to be:

  1. I start one worker with -B, using a script. Normally I only start one, and I restart it with the script.
  2. But sometimes when I try to restart the worker, it takes too long waiting for the worker to shut down gracefully. So I killed it (but the kill failed), and then started it again.

So there was actually more than one worker running, each with its own celerybeat.

Solution:

I killed all celerybeat and celeryd processes, made sure nothing was left running, and started the worker again.
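One way to check that only a single beat-enabled worker is left is to count top-level celeryd processes started with -B. The sketch below is an assumption-laden helper, not part of Celery: `beat_masters` is a made-up name, and it expects the text printed by `ps -eo pid,ppid,args`. Forked pool children show the same command line as their parent, so plain `ps aux | grep celery` output cannot tell one worker with several children apart from several workers; the parent PID is needed for that.

```python
def beat_masters(ps_output):
    """Return PIDs of top-level celeryd processes started with -B / --beat.

    Expects the output of `ps -eo pid,ppid,args` (header line optional).
    A process whose parent is itself a beat-enabled celeryd is treated as
    a forked pool child and is not counted.
    """
    candidates = {}
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue  # header or malformed line
        pid, ppid, args = int(parts[0]), int(parts[1]), parts[2].split()
        if "celeryd" in args and ("-B" in args or "--beat" in args):
            candidates[pid] = ppid
    # Keep only processes whose parent is not another candidate.
    return sorted(pid for pid, ppid in candidates.items() if ppid not in candidates)
```

If the returned list has more than one PID, more than one scheduler is running and periodic tasks will be queued once per scheduler.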