apscheduler: BackgroundScheduler.get_jobs() hangs when used with Flask and Sqlalchemy
Motivation
On startup, I’d like to be able to add a persistent job store and add a job only if that job store is empty.
What works
With sqlalchemy, but without flask, the following code works:
from time import sleep

import sqlalchemy as sa
# Connect to example.sqlite and add a new job only if there are no jobs already.
from apscheduler.schedulers.background import BackgroundScheduler

log = print
engine = sa.create_engine('sqlite:///{}'.format('example.sqlite'))


def alarm():
    print('Alarm')


if __name__ == '__main__':
    scheduler = BackgroundScheduler()
    log("created scheduler")
    scheduler.add_jobstore('sqlalchemy', engine=engine)
    log("Added jobstore")
    scheduler.start()
    log("Started scheduler")
    if not scheduler.get_jobs():
        log("Added job")
        scheduler.add_job(alarm, 'interval', seconds=20)
    else:
        log("Didn't add job.")
    try:
        while True:
            sleep(2)
    except (KeyboardInterrupt, SystemExit):
        pass
With Flask but without sqlalchemy, the following works. It doesn’t make use of persistent storage, of course, but get_jobs() returns []:
import os
from time import sleep

import flask
from apscheduler.schedulers.background import BackgroundScheduler

# Verify that apscheduler works with flask, as long as we don't use
# persistent storage.
# run with
# & { $env:FLASK_APP='demo.py'; $env:FLASK_DEBUG=1; python -m flask run}
app = flask.Flask(__name__)
log = app.logger.info
log("Created App")


def alarm():
    print('Alarm')


if not app.debug or os.environ.get("WERKZEUG_RUN_MAIN") == 'true':
    scheduler = BackgroundScheduler()
    log("created scheduler")
    scheduler.start()
    log("Started scheduler")
    if not scheduler.get_jobs():
        app.logger.info("Added job")
        scheduler.add_job(alarm, 'interval', seconds=20)
    else:
        app.logger.info("Didn't add job.")
What doesn’t work:
When I try to add a persistent job store to this flask app, scheduler.get_jobs() hangs:
import os
from time import sleep

import sqlalchemy as sa
import flask
from apscheduler.schedulers.background import BackgroundScheduler

# Hangs at get_jobs()
app = flask.Flask(__name__)
log = app.logger.info
log("Created App")

### NEW ###
engine = sa.create_engine('sqlite:///{}'.format('example.sqlite'))
###########


def alarm():
    print('Alarm')


# Don't create two schedulers when running in debug mode.
if not app.debug or os.environ.get("WERKZEUG_RUN_MAIN") == 'true':
    scheduler = BackgroundScheduler()
    log("created scheduler")
    ### NEW ###
    scheduler.add_jobstore('sqlalchemy', engine=engine)
    log("Added jobstore")
    ###########
    scheduler.start()
    log("Started scheduler")
    if not scheduler.get_jobs():
        app.logger.info("Added job")
        scheduler.add_job(alarm, 'interval', seconds=20)
    else:
        app.logger.info("Didn't add job.")
Environment
Windows 10 Python 3.5.3
Running in a virtual environment with:
Package      Version
------------ -------
APScheduler  3.4.0
click        6.7
Flask        0.12.2
itsdangerous 0.24
Jinja2       2.10
MarkupSafe   1.0
pip          9.0.1
pytz         2017.3
setuptools   37.0.0
six          1.11.0
SQLAlchemy   1.1.15
tzlocal      1.4
Werkzeug     0.12.2
wheel        0.30.0
Edit:
This also appears on Linux with Python 3.6.2. All the package versions are the same.
About this issue
- State: closed
- Created 7 years ago
- Reactions: 1
- Comments: 23 (9 by maintainers)
Thanks for responding @agronholm. I dropped flask-apscheduler and am now using APScheduler directly. I am able to replicate the issue. I’ve done some debugging and have found a deadlock related to _jobstores_lock.
After starting APScheduler, a thread is spun off which executes _process_jobs; this acquires _jobstores_lock here: https://github.com/agronholm/apscheduler/blob/cbf2eeb21695343c1996e59732adbc8fbbab6842/apscheduler/schedulers/base.py#L929
A query executed within the context of this lock by a function called get_due_jobs never returns. I followed the execution down to the last executed line, and it seems there is an issue receiving a new cursor from SQLAlchemy’s connection pool (https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/pool.py#L970).
Simultaneously, while that query hangs, the main thread executes add_job, which also requests _jobstores_lock. That lock request occurs here: https://github.com/agronholm/apscheduler/blob/cbf2eeb21695343c1996e59732adbc8fbbab6842/apscheduler/schedulers/base.py#L428
My config
init.py
config/default.py
Output
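The lock contention described in this comment can be reproduced in miniature with the standard library alone. The sketch below is not APScheduler code: `jobstores_lock`, `process_jobs`, `add_job`, and the `query_done` event are hypothetical stand-ins, and the "query that never returns" is modelled by waiting on an event that nobody sets. A timeout is used so the example terminates instead of hanging.

```python
import threading

# Stand-in for the scheduler's _jobstores_lock (assumption: a re-entrant lock).
jobstores_lock = threading.RLock()
# Never set while the "query" is stuck; models the call that never returns.
query_done = threading.Event()
# Lets the main thread wait until the worker really holds the lock.
lock_taken = threading.Event()


def process_jobs():
    """Background thread: take the lock, then block on the 'query'."""
    with jobstores_lock:
        lock_taken.set()
        query_done.wait()  # blocks while still holding the lock


def add_job(timeout):
    """Main thread: needs the same lock; report whether it was acquired."""
    acquired = jobstores_lock.acquire(timeout=timeout)
    if acquired:
        jobstores_lock.release()
    return acquired


worker = threading.Thread(target=process_jobs, daemon=True)
worker.start()
lock_taken.wait()

# While the worker is stuck inside the lock, add_job cannot proceed.
blocked = not add_job(timeout=0.5)
print("add_job blocked:", blocked)

# Once the stuck call returns, the lock is released and add_job succeeds.
query_done.set()
worker.join()
released = add_job(timeout=0.5)
print("add_job after release:", released)
```

Without the timeout, the call in the main thread would wait indefinitely, which matches the symptom reported above: nothing fails, the program simply stops making progress.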
Yes this is still relevant. I ran into this issue a few days ago using gunicorn, flask, apscheduler and sqlalchemy.
ref_to_obj (called by get_jobs) was able to import correctly before we added a job. However, once add_job has been invoked, it no longer works. We also have the same issue when we try to add_job using flask/sqlalchemy/apscheduler. The lock is not released after a job is added (in the sense that the first add_job returns), so we cannot add a second job (the second add_job hangs). Since other people have the same issue, I am wondering whether anyone has solved it.
I’m experiencing a very similar issue using APScheduler through the https://github.com/viniciuschiele/flask-apscheduler project. I too am trying to use the SQLAlchemy jobstore, in my case backed by a postgresql DB. I initialize and start the scheduler, and try to call add_job, which causes my project to hang. If I disable the SQLAlchemy job store, all works as expected.
Environment
Mac 10.13.1 Python 2.7.13
I ran into a problem like https://github.com/unbit/uwsgi/issues/844 and solved it by adding --enable-threads. MIL issues; maybe this helps. DEBUG should be false in a production environment.
This is definitely still an issue, but it’s not with APScheduler. It’s with the way most Flask apps use a global app instance, and with the Flask-APScheduler docs, which show calling start and then defining your tasks.
The root cause is that when the scheduler tries to load the state data from the job store, the module it tries to import with __import__ hasn’t been fully read by the interpreter yet and thus isn’t in the namespace. Why does it hang? That’s a good question. I believe that since the module is not fully loaded, it enters some kind of circular dependency that the interpreter can’t detect.
Simply ensuring that the entire module is loaded before calling scheduler.start() resolved this issue for me.
I placed scheduler.start() at the bottom of the module file where I defined scheduler.
Just curious, does anyone know what the root cause is? We have the same issue: add_job hangs with a flask/sqlalchemy/apscheduler combination. We traced it back to __import__ in ref_to_obj in util.py, which hangs forever.
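To make the import step discussed in these comments concrete, here is a minimal sketch of how a stored textual reference such as 'module:name' can be turned back into a callable. `resolve_ref` is a hypothetical, simplified stand-in written for illustration; it is not APScheduler's ref_to_obj, and the 'module:qualified.name' format is an assumption of this sketch.

```python
import importlib


def resolve_ref(ref):
    """Resolve a 'module:qualified.name' string to the object it names.

    Hypothetical, simplified illustration of the lookup a persistent job
    store has to perform when it restores a job's callable.
    """
    module_name, _, qualname = ref.partition(":")
    if not module_name or not qualname:
        raise ValueError("expected 'module:name', got {!r}".format(ref))
    # The import is the step the comments above report as hanging: if the
    # target module is still being imported by another thread, this call
    # has to wait for that import to finish.
    obj = importlib.import_module(module_name)
    # Walk dotted attribute paths such as 'Class.method'.
    for part in qualname.split("."):
        obj = getattr(obj, part)
    return obj


if __name__ == "__main__":
    import json

    dumps = resolve_ref("json:dumps")
    print(dumps is json.dumps)
```

If the job function lives in the same module that creates and starts the scheduler, a lookup like this can run while that module is still only partly executed. That is consistent with the fix reported above of calling scheduler.start() only at the bottom of the module, after every job function has been defined.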