keras: TypeError: can't pickle _thread.lock objects
Information:
- Keras version 2.0.8
- Tensorflow version 1.3.0
- Python 3.6
Minimal example to reproduce the error:
from keras.layers import Input, Lambda, Dense
from keras.models import Model
from keras.callbacks import ModelCheckpoint
from keras.optimizers import Adam
import tensorflow as tf
import numpy as np
x = Input(shape=(30,3))
low = tf.constant(np.random.rand(30, 3).astype('float32'))
high = tf.constant(1 + np.random.rand(30, 3).astype('float32'))
clipped_out_position = Lambda(lambda x, low, high: tf.clip_by_value(x, low, high),
                              arguments={'low': low, 'high': high})(x)
model = Model(inputs=x, outputs=[clipped_out_position])
optimizer = Adam(lr=.1)
model.compile(optimizer=optimizer, loss="mean_squared_error")
checkpoint = ModelCheckpoint("debug.hdf", monitor="val_loss", verbose=1, save_best_only=True, mode="min")
training_callbacks = [checkpoint]
model.fit(np.random.rand(100, 30, 3), [np.random.rand(100, 30, 3)], callbacks=training_callbacks, epochs=50, batch_size=10, validation_split=0.33)
Error output:
Train on 67 samples, validate on 33 samples
Epoch 1/50
10/67 [===>..........................] - ETA: 0s - loss: 0.1627Epoch 00001: val_loss improved from inf to 0.17002, saving model to debug.hdf
Traceback (most recent call last):
File "debug_multitask_inverter.py", line 19, in <module>
model.fit(np.random.rand(100, 30, 3), [np.random.rand(100, 30, 3)], callbacks=training_callbacks, epochs=50, batch_size=10, validation_split=0.33)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/engine/training.py", line 1631, in fit
validation_steps=validation_steps)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/engine/training.py", line 1233, in _fit_loop
callbacks.on_epoch_end(epoch, epoch_logs)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/callbacks.py", line 73, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/callbacks.py", line 414, in on_epoch_end
self.model.save(filepath, overwrite=True)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/engine/topology.py", line 2556, in save
save_model(self, filepath, overwrite, include_optimizer)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/models.py", line 107, in save_model
'config': model.get_config()
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/site-packages/keras/engine/topology.py", line 2397, in get_config
return copy.deepcopy(config)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
y = copier(x, memo)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
y = copier(x, memo)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 215, in _deepcopy_list
append(deepcopy(a, memo))
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
y = copier(x, memo)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
y = copier(x, memo)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
y = copier(x, memo)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 180, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 280, in _reconstruct
state = deepcopy(state, memo)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
y = copier(x, memo)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 180, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 280, in _reconstruct
state = deepcopy(state, memo)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
y = copier(x, memo)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 180, in deepcopy
y = _reconstruct(x, memo, *rv)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 280, in _reconstruct
state = deepcopy(state, memo)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 150, in deepcopy
y = copier(x, memo)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 240, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
File "/om/user/lnj/openmind_env/tensorflow-gpu/lib/python3.6/copy.py", line 169, in deepcopy
rv = reductor(4)
TypeError: can't pickle _thread.lock objects
It seems this error has occurred in the past in different contexts here, but I'm not dumping the model directly; I'm using the ModelCheckpoint callback. Any idea what could be going wrong?
About this issue
- State: closed
- Created 7 years ago
- Reactions: 21
- Comments: 24 (1 by maintainers)
Need to see your code. The basic idea is to check for loops in your data references.
This exception is raised mainly because you're trying to serialize an unserializable object. In this context, the "unserializable" object is a tf.Tensor.
So remember this: don't let raw tf.Tensors wander around in your model.
In my case, I was trying to use K.shape() to get the shape of a tensor and reuse it later, like this:
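(The snippet did not survive in this copy of the thread; below is a minimal reconstruction of the pattern being described. The input shape and the reshape are illustrative; only x_shape follows the comment's wording.)
from keras import backend as K
from keras.layers import Input, Lambda

x = Input(shape=(30, 3))
x_shape = K.shape(x)   # a raw TensorFlow tensor, not produced by any Keras layer

# the raw tensor is reused later, here via the closure of another Lambda,
# so it ends up in that layer's serialized config
flat = Lambda(lambda t: K.reshape(t, [x_shape[0], -1]))(x)
# saving a model that contains this layer fails with
# "TypeError: can't pickle _thread.RLock objects"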
x_shape is a TensorFlow tensor; it is not associated with any Keras layer. That's why I call it a lonely wandering tensor. It will cause the "can't pickle _thread.RLock objects" error.
A possible solution would be:
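(The original solution snippet is also missing; what follows is a sketch of the kind of fix the comment describes, keeping all raw-tensor work inside the layer function.)
from keras import backend as K
from keras.layers import Input, Lambda

def flatten_keep_batch(t):
    s = K.shape(t)                    # the shape tensor only lives inside the function
    return K.reshape(t, [s[0], -1])

x = Input(shape=(30, 3))
flat = Lambda(flatten_keep_batch)(x)  # the layer config no longer references a raw tensor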
No wandering TensorFlow tensors, no errors.
I also face this problem (my model has a Lambda layer). I can avoid this error by setting save_weights_only=True in ModelCheckpoint. Found the solution on StackOverflow: https://stackoverflow.com/questions/44855603/typeerror-cant-pickle-thread-lock-objects-in-seq2seq
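Applied to the checkpoint from the example at the top of this issue, that workaround looks roughly like this (the filename is illustrative):
from keras.callbacks import ModelCheckpoint

# only the weights are written, so Keras never calls model.save() and the
# config deep-copy / pickling step in the traceback above is skipped
checkpoint = ModelCheckpoint("debug_weights.hdf5", monitor="val_loss", verbose=1,
                             save_best_only=True, mode="min",
                             save_weights_only=True)
To restore, rebuild the model in code and call model.load_weights(...) on the saved file.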
The code above works. If you refer to self.latent_dim and self.epsilon_std directly inside sampling, it will complain.
See this answer on Stack Overflow.
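(The working sampling code referred to above is not preserved here; the following is a hedged reconstruction of the pattern these comments describe, with illustrative values: latent_dim=2, epsilon_std=1.0, and a toy encoder.)
from keras import backend as K
from keras.layers import Input, Dense, Lambda

def sampling(args, latent_dim, epsilon_std):
    # everything this function needs arrives as an argument; it never reads self.*
    z_mean, z_log_var = args
    epsilon = K.random_normal(shape=(K.shape(z_mean)[0], latent_dim),
                              mean=0., stddev=epsilon_std)
    return z_mean + K.exp(z_log_var / 2) * epsilon

inputs = Input(shape=(784,))                 # illustrative encoder
h = Dense(16, activation='relu')(inputs)
z_mean = Dense(2)(h)
z_log_var = Dense(2)(h)

# plain Python numbers go through `arguments`, so the layer config stays picklable
z = Lambda(sampling, arguments={'latent_dim': 2, 'epsilon_std': 1.0})([z_mean, z_log_var])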
Python cannot pickle lambda expressions. You may want to try replacing them (e.g. the one you passed to your Lambda layer) with named functions, as @lyxm suggested.
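A hedged adaptation of that suggestion to the snippet at the top of this issue: the lambda becomes a named, module-level function, and the clipping bounds are passed as plain Python lists rather than tf constants, so everything the Lambda layer stores in its config is picklable.
import numpy as np
import tensorflow as tf
from keras.layers import Input, Lambda
from keras.models import Model

def clip_to_bounds(t, low, high):
    # a named function is picklable, unlike a lambda
    return tf.clip_by_value(t, low, high)

# plain nested Python lists instead of tf.constant: picklable and deep-copyable
low = np.random.rand(30, 3).astype('float32').tolist()
high = (1 + np.random.rand(30, 3).astype('float32')).tolist()

x = Input(shape=(30, 3))
clipped = Lambda(clip_to_bounds, arguments={'low': low, 'high': high})(x)
model = Model(inputs=x, outputs=clipped)
With this change, the config deep-copy in the traceback above no longer encounters a TensorFlow tensor.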
I looked into it some more and it seems to have something to do with the Lambda layer when it hits this line. Attempting to get_config on the Lambda layer seems to lock the config object somehow?
Edit: It seems like this happens only when the arguments are TensorFlow tensors. Is there any way to get around this issue so I can wrap a tf function in a Lambda layer?
I am still facing this issue and have not been able to solve it using the solutions mentioned above. Can someone please advise for my case, described here: https://stackoverflow.com/questions/57233539/typeerror-cant-pickle-thread-rlock-objects
I have a similar error message using Python 3.6 on Windows, but this is not related to Keras. Some things to try that might help you trace the cause:
- Try Python 3.7 on Windows. In my project this works, but Python 3.6 on Windows does not.
- Try Python 3 on Linux. I have used Python 3.5 on Linux with the same code and it does not raise this error.
- On Linux, try adding the line multiprocessing.set_start_method('spawn') (a short sketch follows this comment); note that this may break Linux code that was previously working.
I am doing a multiprocessing project, developing on a Windows computer but ultimately deploying on a Raspberry Pi (Linux). It has been tricky to get everything to work, especially on Windows.
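(For reference, the set_start_method call mentioned in that comment has to run once, under the main guard, before any pools or processes are created; a minimal sketch:)
import multiprocessing

if __name__ == '__main__':
    # must be called once, before any Pool or Process is created
    multiprocessing.set_start_method('spawn')
    # ... build the model and run training here ...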
Hitting this problem with a VAE, without any GRU/RNN/LSTM. The strange thing is that it was fine before, and only started acting up after I put the model into a class.
Fixed. The VAE has a lambda which refers to attributes of the class. Assign them to local variables and let the lambda use only the locals.
It looks like it works by changing return copy.deepcopy(config) to just return config in Model.get_config() in topology.py (Keras 2.1.6). However, is there any consequence in doing that?
=====
Just to be safe, instead of doing the above directly, I added a parameter no_deep_copy to Model.get_config(), defaulting to False, and changed the return statement to return config if no_deep_copy else copy.deepcopy(config). I then changed the 'config': model.get_config() line (which caused the problem; see the traceback above) to 'config': model.get_config(no_deep_copy=True) in save_model(model, filepath, overwrite=True, include_optimizer=True) in models.py (Keras 2.1.6). Could you let me know if there is any consequence in doing that? Thanks!