distributed: "can't pickle thread.lock objects" when calling array.store with distributed

I am trying to store a dask array using distributed. When I call store, I get an error “can’t pickle thread.lock objects”.

I originally was trying this in a much more complex context involving netCDF, xarray, etc. But I managed to come up with the following minimal example.

import numpy as np
import dask.array as da
from distributed import Client

def create_and_store_dask_array():
    shape = (10000, 1000)
    chunks = (1000, 1000)
    data = da.zeros(shape, chunks=chunks)
    store = np.memmap('test.memmap', mode='w+', dtype=data.dtype, shape=data.shape)
    data.store(store)
    print("Success!")

create_and_store_dask_array()
client = Client()
create_and_store_dask_array()

The first call works, but the second fails. The output is:

Success!
/home/rpa/.conda/envs/lagrangian_vorticity/lib/python2.7/site-packages/distributed/protocol/pickle.pyc - INFO - Failed to serialize (<function store at 0x7f0ee802f488>, (<functools.partial object at 0x7f0ec84f1418>, (1000, 1000)), (slice(2000, 3000, None), slice(0, 1000, None)), <thread.lock object at 0x7f0f2c715af0>)
Traceback (most recent call last):
  File "/home/rpa/.conda/envs/lagrangian_vorticity/lib/python2.7/site-packages/distributed/protocol/pickle.py", line 43, in dumps
    return cloudpickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL)
  File "/home/rpa/.conda/envs/lagrangian_vorticity/lib/python2.7/site-packages/cloudpickle/cloudpickle.py", line 706, in dumps
    cp.dump(obj)
  File "/home/rpa/.conda/envs/lagrangian_vorticity/lib/python2.7/site-packages/cloudpickle/cloudpickle.py", line 146, in dump
    return Pickler.dump(self, obj)
  File "/home/rpa/.conda/envs/lagrangian_vorticity/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/home/rpa/.conda/envs/lagrangian_vorticity/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/rpa/.conda/envs/lagrangian_vorticity/lib/python2.7/pickle.py", line 568, in save_tuple
    save(element)
  File "/home/rpa/.conda/envs/lagrangian_vorticity/lib/python2.7/pickle.py", line 306, in save
    rv = reduce(self.proto)
TypeError: can't pickle thread.lock objects
/home/rpa/.conda/envs/lagrangian_vorticity/lib/python2.7/site-packages/distributed/protocol/core.pyc - CRITICAL - Failed to Serialize
Traceback (most recent call last):
  File "/home/rpa/.conda/envs/lagrangian_vorticity/lib/python2.7/site-packages/distributed/protocol/core.py", line 43, in dumps
    for key, value in data.items()
  File "/home/rpa/.conda/envs/lagrangian_vorticity/lib/python2.7/site-packages/distributed/protocol/core.py", line 44, in <dictcomp>
    if type(value) is Serialize}
  File "/home/rpa/.conda/envs/lagrangian_vorticity/lib/python2.7/site-packages/distributed/protocol/serialize.py", line 106, in serialize
    header, frames = {}, [pickle.dumps(x)]
  File "/home/rpa/.conda/envs/lagrangian_vorticity/lib/python2.7/site-packages/distributed/protocol/pickle.py", line 43, in dumps
    return cloudpickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL)
  File "/home/rpa/.conda/envs/lagrangian_vorticity/lib/python2.7/site-packages/cloudpickle/cloudpickle.py", line 706, in dumps
    cp.dump(obj)
  File "/home/rpa/.conda/envs/lagrangian_vorticity/lib/python2.7/site-packages/cloudpickle/cloudpickle.py", line 146, in dump
    return Pickler.dump(self, obj)
  File "/home/rpa/.conda/envs/lagrangian_vorticity/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/home/rpa/.conda/envs/lagrangian_vorticity/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/rpa/.conda/envs/lagrangian_vorticity/lib/python2.7/pickle.py", line 568, in save_tuple
    save(element)
  File "/home/rpa/.conda/envs/lagrangian_vorticity/lib/python2.7/pickle.py", line 306, in save
    rv = reduce(self.proto)
TypeError: can't pickle thread.lock objects

Versions:

import dask
print dask.__version__
import distributed
print distributed.__version__
>>> 0.12.0
>>> 1.14.3

About this issue

  • Original URL
  • State: open
  • Created 7 years ago
  • Comments: 20 (13 by maintainers)

Commits related to this issue

Most upvoted comments

Can you try the following

# data.store(store)

from dask.utils import SerializableLock
lock = SerializableLock()
data.store(store, lock=lock)