apscheduler: Using SQLAlchemy datastore with another SQLAlchemy instance: "TypeError: can't pickle _thread._local objects"

The application throws TypeError: can't pickle _thread._local objects when I use another SQLAlchemy instance.

Expected Behavior

apscheduler should work.

Current Behavior

I have an application that instantiates a class whose methods use SQLAlchemy.

Steps to Reproduce

  1. Use any scheduler with a SQLAlchemy-backed job store
  2. Create another module with a class that has a method that uses SQLAlchemy
  3. Start the scheduler and schedule a call to the method that uses SQLAlchemy

Dump

TypeError: can't pickle _thread._local objects
apscheduler.scheduler - INFO - Adding job tentatively -- it will be properly scheduled when the scheduler starts
Traceback (most recent call last):
  File "scheduler.py", line 65, in <module>
    sched.start()
  File "C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\site-packages\apscheduler\schedulers\background.py", line 33, in start
    BaseScheduler.start(self, *args, **kwargs)
  File "C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\site-packages\apscheduler\schedulers\base.py", line 162, in start
    self._real_add_job(job, jobstore_alias, replace_existing)
  File "C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\site-packages\apscheduler\schedulers\base.py", line 867, in _real_add_job
    store.add_job(job)
  File "C:\Users\admin\AppData\Local\Programs\Python\Python37\lib\site-packages\apscheduler\jobstores\sqlalchemy.py", line 95, in add_job
    'job_state': pickle.dumps(job.__getstate__(), self.pickle_protocol)
TypeError: can't pickle _thread._local objects

If you need more info just let me know

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 19 (10 by maintainers)

Most upvoted comments

@skjoher @ersaijun @ziyueWu @aidiss

If you run into this issue and don’t know how to change your architecture, then read the following:

  • There’s something in your code that can’t be pickled, i.e. it cannot be “passed” from one process to another. Probably that’s a member field of a class.
  • You need to figure out what it is. Normally the error log will give you some hints, although not always. Look around your class code and think about what fields you are storing. Could any of these cause issues with pickling? Primitive objects are easy to pickle; you want to look for complex stuff. Maybe you’re storing a complex uWSGI app in your class? Things like that can cause problems.
  • The easiest way to solve this is to ‘destroy’ and ‘recreate’ these complex fields before and after pickling respectively. Consider the following class:

import ssl

from apscheduler.executors.pool import ProcessPoolExecutor
from apscheduler.schedulers.blocking import BlockingScheduler

class MyClass():
    def __init__(self):
        self.ssl_context = ssl.SSLContext()

    def run_task(self):
        print('running task')

my_class = MyClass()
scheduler = BlockingScheduler(executors={'default': ProcessPoolExecutor()})
scheduler.add_job(my_class.run_task)

We create an instance of MyClass that contains a field that cannot be pickled: self.ssl_context. We then attempt to schedule the run_task method using APS, which, as Alex outlined above, will attempt to pickle all fields of the my_class instance.

To solve this, let’s avoid pickling ssl_context and recreate it manually upon unpickling instead:

import ssl

class MyClass():
    def __init__(self):
        self.ssl_context = ssl.SSLContext()

    def run_task(self):
        print('running task')

    # will be called on pickling
    def __getstate__(self):
        state = self.__dict__.copy()
        del state['ssl_context'] # remove the unpicklable ssl_context
        return state

    # will be called on unpickling
    def __setstate__(self, state):
        self.__dict__.update(state)
        self.ssl_context = ssl.SSLContext() # recreate the ssl_context 

This method then applies to whatever is causing your pickling issues. Look around your code, find out what can’t be pickled and use this destroy-and-rebuild method. Bear in mind that this may not work for every case (as some objects cannot be rebuilt like this correctly), but chances are your class can indeed be rewritten. Remember that this rebuilding can use objects that were successfully pickled by accessing the state!
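When the traceback doesn’t name the offending field, a small diagnostic helper can locate it by trying to pickle each attribute on its own. The function below is a hypothetical sketch (not part of APScheduler), shown against an example class with a deliberately unpicklable lock member:

```python
import pickle
import threading

def find_unpicklable_fields(obj):
    """Return (name, type, error) for each attribute that fails to pickle."""
    problems = []
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception as exc:
            problems.append((name, type(value).__name__, str(exc)))
    return problems

class Example:
    def __init__(self):
        self.hostname = 'example.org'  # primitive, pickles cleanly
        self.lock = threading.Lock()   # the culprit

for name, typename, error in find_unpicklable_fields(Example()):
    print(name, typename, error)
```

Once the culprit is known, apply the __getstate__/__setstate__ destroy-and-rebuild pattern to exactly those fields.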

Funny enough, the APS schedulers themselves cannot be pickled either. Make sure that none of your pickled classes store any APS schedulers, ie. don’t do this:

class MyClass():
    def __init__(self):
        # THIS IS A BAD IDEA!
        self.scheduler = BlockingScheduler(executors={'default': ProcessPoolExecutor()})
        self.scheduler.add_job(self.run_task)

    def run_task(self):
        print('running task')

Or if you do, make sure to delete the scheduler on pickling, and recreate it (including its jobs!) on unpickling.

You can read more in this article on Medium.

Good luck! 👋

Thanks. I understand now.

So, I cannot use job stores when I add a job bound to an instance. The code looks like this:

api_audio = API_audio()  # an instance of the class

m_scheduler = BackgroundScheduler(jobstores=jobstores)
m_scheduler.add_job(api_audio.stop_program, id='job_date_once',
                    trigger='date',
                    run_date='2020-08-06 20:06:05', replace_existing=True)
m_scheduler.start()

It causes the error TypeError: can’t pickle _thread.lock objects.

Is there any method to solve the problem?

Serializing means converting an in-memory object to a bytestream that can be saved to persistent storage and later restored from there. An instance method is a function bound to an instance of a class. When you schedule a job that targets an instance method, that instance must be serialized along with the reference to the method definition in order for the restoration to be possible.