keras: "bad marshal data" when loading model that was saved with python 2.7 into python 3.4.

Please make sure that the boxes below are checked before you submit your issue. If your issue is an implementation question, please ask your question on StackOverflow or join the Keras Slack channel and ask there instead of filing a GitHub issue.

Thank you!

  • Check that you are up-to-date with the master branch of Keras. You can update with: pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps

  • If running on TensorFlow, check that you are up-to-date with the latest version. The installation instructions can be found here.

  • If running on Theano, check that you are up-to-date with the master branch of Theano. You can update with: pip install git+git://github.com/Theano/Theano.git --upgrade --no-deps

  • Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short).

I trained a model using Python 2.7, and now I need to load it using Python 3.4. The model includes a simple Lambda layer. The simplified script below reproduces the error:

from keras.models import Model, load_model
from keras.layers import Input, Lambda
import sys

if sys.version_info < (3,4):
    inp = Input(shape=(28,28,1))
    x = Lambda(lambda x: x + 1)(inp)
    model = Model(inp, x)
    model.save('lambdamodel.hdf5')
else:
    model = load_model('lambdamodel.hdf5') # Error here.
    model.summary()

The model gets created and saved fine in a Python 2.7 virtualenv:

(py2keras) kzh@otter:tmp$ python --version
Python 2.7.12
(py2keras) kzh@otter:tmp$ pip list
DEPRECATION: The default format will switch to columns in the future. You can use --format=(legacy|columns) (or define a format=(legacy|columns) in your pip.conf under the [list] section) to disable this warning.
backports.weakref (1.0rc1)
bleach (1.5.0)
funcsigs (1.0.2)
h5py (2.7.0)
html5lib (0.9999999)
Keras (2.0.6)
Markdown (2.6.8)
mock (2.0.0)
numpy (1.13.1)
pbr (3.1.1)
pip (9.0.1)
protobuf (3.3.0)
PyYAML (3.12)
scipy (0.19.1)
setuptools (36.2.3)
six (1.10.0)
tensorflow (1.2.1)
Theano (0.9.0)
Werkzeug (0.12.2)
wheel (0.29.0)
(py2keras) kzh@otter:tmp$ python lambdabug.py 
Using TensorFlow backend.
Done

The error comes up when loading in a Python 3 virtualenv:

(py3keras) kzh@otter:tmp$ python --version
Python 3.5.2
(py3keras) kzh@otter:tmp$ pip list
backports.weakref (1.0rc1)
bleach (1.5.0)
h5py (2.7.0)
html5lib (0.9999999)
Keras (2.0.6)
Markdown (2.6.8)
numpy (1.13.1)
pip (9.0.1)
protobuf (3.3.0)
PyYAML (3.12)
scipy (0.19.1)
setuptools (36.2.3)
six (1.10.0)
tensorflow (1.2.1)
Theano (0.9.0)
Werkzeug (0.12.2)
wheel (0.29.0)
(py3keras) kzh@otter:tmp$ python lambdabug.py 
Using TensorFlow backend.
Traceback (most recent call last):
  File "lambdabug.py", line 11, in <module>
    model = load_model('lambdamodel.hdf5')
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/models.py", line 233, in load_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/models.py", line 307, in model_from_config
    return layer_module.deserialize(config, custom_objects=custom_objects)
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/layers/__init__.py", line 54, in deserialize
    printable_module_name='layer')
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/utils/generic_utils.py", line 139, in deserialize_keras_object
    list(custom_objects.items())))
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/engine/topology.py", line 2476, in from_config
    process_layer(layer_data)
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/engine/topology.py", line 2462, in process_layer
    custom_objects=custom_objects)
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/layers/__init__.py", line 54, in deserialize
    printable_module_name='layer')
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/utils/generic_utils.py", line 139, in deserialize_keras_object
    list(custom_objects.items())))
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/layers/core.py", line 697, in from_config
    function = func_load(config['function'], globs=globs)
  File "/home/kzh/.envs/py3keras/lib/python3.5/site-packages/keras/utils/generic_utils.py", line 200, in func_load
    code = marshal.loads(code.encode('raw_unicode_escape'))
ValueError: bad marshal data (unknown type code)

I’ve never used marshal directly myself and don’t have time to dig much further into this. In the meantime I’ll keep using Python 2.7 for the code I was planning to move to 3.4. Any tips or fixes are appreciated.

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Comments: 40 (1 by maintainers)

Most upvoted comments

Guys, it was solved when I updated both Keras and TensorFlow to the latest versions:

pip install --upgrade tensorflow
pip install --upgrade keras

One (somewhat hacky) fix is the following: if you can recreate the architecture (i.e. you have the original code used to generate it), you can instantiate the model from that code and then use model.load_weights('your_model_file.hdf5') to load in the weights. This works for me, but it isn’t an option if you don’t have the code used to create the original architecture. A short sketch of this approach follows.
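
A minimal sketch of that workaround, assuming a hypothetical build_model() that re-creates the original architecture (the layer shapes below are placeholders, not the original poster's code):

# Hypothetical build_model(): re-create the exact architecture used for training.
from keras.models import Model
from keras.layers import Input, Lambda

def build_model():
    inp = Input(shape=(28, 28, 1))
    x = Lambda(lambda x: x + 1)(inp)
    return Model(inp, x)

model = build_model()
# load_weights can read the weights out of a full model.save() file,
# so the marshal-encoded Lambda in the saved config is never deserialized.
model.load_weights('your_model_file.hdf5')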

Just in case anyone else gets here and is as frustrated as I was… This is not caused by the Keras encoding of the model but by the serialization of the custom objects. So the only way of solving this is actually by re-creating the model with the custom objects in Python 3.x, loading the weights and saving the model again.

The code will look something like:

# create_model returns the model
m = create_model((512,512,3), 2)
m.load_weights('xxx.h5') # note that weights can be loaded from a full save, not only from save_weights file
m.save('xxx_3.5.h5')

One solution is to use the TensorFlow native SavedModel format for saving, i.e.

model.save('filename') # instead of model.save('filename.h5')

and now when you load the model, you can even pass custom objects:

loaded_1 = keras.models.load_model(
    "filename", custom_objects={"CustomModel": CustomModel} # can be custom layer or anything
)

https://www.tensorflow.org/guide/keras/save_and_serialize

@anentropic

Is marshal the best choice to be using here?

Absolutely not IMO. Using marshal breaks load_model across even minor versions of Python, e.g. 3.5 <-> 3.6! I am not aware of any benefit to marshal over pickle.

Is marshal the best choice to be using here?

https://docs.python.org/3/library/marshal.html

The marshal module exists mainly to support reading and writing the “pseudo-compiled” code for Python modules of .pyc files. Therefore, the Python maintainers reserve the right to modify the marshal format in backward incompatible ways should the need arise. If you’re serializing and de-serializing Python objects, use the pickle module instead – the performance is comparable, version independence is guaranteed, and pickle supports a substantially wider range of objects than marshal.
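
To make the incompatibility concrete, here is a small self-contained sketch (illustrative only, not the actual Keras func_dump/func_load code): marshal serializes the Lambda function's compiled bytecode, and that format is only guaranteed to be readable by the interpreter version that produced it.

import marshal
import types

f = lambda x: x + 1
payload = marshal.dumps(f.__code__)  # bytes tied to this interpreter's bytecode format

# Same interpreter version: the round-trip works.
code = marshal.loads(payload)
g = types.FunctionType(code, globals())
assert g(1) == 2

# Reading `payload` under a *different* Python version typically raises
# ValueError: bad marshal data (unknown type code) -- the same error as in the traceback above.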

I got the same problem. I’ve tried to upgrade both Keras and TF but it still doesn’t work. I have a self-defined layer in my model; not sure if that is what causes the problem. Does anyone know how to fix it? Massive thanks.

Solved when I upgraded to the latest version of Keras (2.1.4).

Thank you @q-ycong-p! The solution works perfectly for TF version 2.9 and Python version 3.9.1.

Just use model.save('file') instead of model.save('file.h5').

Thank you @ahmadmustafaanis. In my case, the model was instantiated, trained and saved with Python 3.6, and the .h5 failed to load_model on Python 3.8. With ahmadmustafaanis’s suggestion, I was able to work around this error without re-instantiating the architecture from the original code:

# in Python3.6
keras_model = tf.keras.models.load_model("old.h5") # succeeds, because we're using python3.6
keras_model.save("saved_model_path")

# in Python3.8
keras_model = tf.keras.models.load_model("old.h5") # fails with bad marshal error
keras_model = tf.keras.models.load_model("saved_model_path") # succeeds loading SavedModel format

Based on the insightful discussions above, I’d imagine that serialization/deserialization of the SavedModel format is different from that of the HDF5 format - the former might not run into the marshal module’s backward-compatibility issue? Just sharing my observation. Please correct me if I’m wrong.

Trained with Python 2.7.4 + TensorFlow 1.2.0 (maybe) + Keras (version unknown). Reloaded with Python 3.6, TF 1.8, Keras 2.1.6. @alexklibisz's answer solved my problem. I first tried to use model.to_json and model_from_json(json_string) to rebuild the model architecture, but that failed. Then I ran the init code again (Model, Dense, ReLU, etc.) and then model.load_weights, and it works now.