transformers: tf.keras.models.load_model() fails to load a saved model that includes a TFOpenAIGPTLMHeadModel layer
- transformers version: 3.1.0
- Platform: linux
- Python version: 3
- Tensorflow version: 2.3.0
To reproduce
Steps to reproduce the behavior:
- Load the model with TFOpenAIGPTLMHeadModel
- Add input layers
- Save the model
- Load the saved model
from transformers import TFOpenAIGPTLMHeadModel
import tensorflow as tf
# ./trans_model is the directory containing the pre-trained model converted from PyTorch
tf_model = TFOpenAIGPTLMHeadModel.from_pretrained('./trans_model', from_pt=True)
max_len = None
input_ids = tf.keras.layers.Input(shape=(max_len,), name='input_ids_layer', dtype='int32')
token_type_ids = tf.keras.layers.Input(shape=(max_len,), name='token_type_ids_layer', dtype='int32')
keras_input = [input_ids, token_type_ids]
qa_output = tf_model(input_ids, token_type_ids=token_type_ids)[0]
keras_model = tf.keras.Model(inputs=keras_input, outputs=qa_output)
keras_model.summary()
keras_model.save("./saved_model")
print('**************************')
model = tf.keras.models.load_model("./saved_model")
Traceback (most recent call last):
File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/util/nest.py", line 395, in assert_same_structure
expand_composites)
ValueError: The two structures don't have the same nested structure.
First structure: type=TensorSpec str=TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs')
Second structure: type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}
More specifically: Substructure "type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}" is a sequence, while substructure "type=TensorSpec str=TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs')" is not
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "interact_test.py", line 208, in <module>
run()
File "interact_test.py", line 180, in run
model = tf.keras.models.load_model("./saved_model")
File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/save.py", line 187, in load_model
return saved_model_load.load(filepath, compile, options)
File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 121, in load
path, options=options, loader_cls=KerasObjectLoader)
File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/saved_model/load.py", line 633, in load_internal
ckpt_options)
File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 194, in __init__
super(KerasObjectLoader, self).__init__(*args, **kwargs)
File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/saved_model/load.py", line 130, in __init__
self._load_all()
File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 221, in _load_all
self._finalize_objects()
File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 526, in _finalize_objects
_finalize_saved_model_layers(layers_revived_from_saved_model)
File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 706, in _finalize_saved_model_layers
inputs = infer_inputs_from_restored_call_function(call_fn)
File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 985, in infer_inputs_from_restored_call_function
spec = nest.map_structure(common_spec, spec, spec2)
File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/util/nest.py", line 629, in map_structure
expand_composites=expand_composites)
File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/util/nest.py", line 402, in assert_same_structure
% (str(e), str1, str2))
ValueError: The two structures don't have the same nested structure.
First structure: type=TensorSpec str=TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs')
Second structure: type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}
More specifically: Substructure "type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}" is a sequence, while substructure "type=TensorSpec str=TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs')" is not
Entire first structure:
.
Entire second structure:
{'input_ids': .}
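The ValueError above comes from a nested-structure comparison: one recorded call signature holds a bare TensorSpec named 'inputs', while another holds a dict keyed by 'input_ids'. The following is a simplified pure-Python sketch of that kind of check, not TensorFlow's actual implementation. `same_structure` is a hypothetical helper, and plain strings stand in for the TensorSpec leaves.

```python
def same_structure(a, b):
    """Return True if a and b share the same nesting (dict keys, sequence lengths).

    Simplified illustration only; leaves are never compared by value.
    """
    a_is_dict, b_is_dict = isinstance(a, dict), isinstance(b, dict)
    if a_is_dict or b_is_dict:
        # A dict can only match another dict with the same keys.
        if not (a_is_dict and b_is_dict) or sorted(a) != sorted(b):
            return False
        return all(same_structure(a[k], b[k]) for k in a)

    a_is_seq = isinstance(a, (list, tuple))
    b_is_seq = isinstance(b, (list, tuple))
    if a_is_seq or b_is_seq:
        # A sequence can only match another sequence of the same length.
        if not (a_is_seq and b_is_seq) or len(a) != len(b):
            return False
        return all(same_structure(x, y) for x, y in zip(a, b))

    # Both values are leaves.
    return True


# Stand-ins for the two structures reported in the traceback.
first = "TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs')"
second = {"input_ids": "TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')"}

mismatch = same_structure(first, second)            # leaf vs. dict
match = same_structure({"input_ids": first}, second)  # dict vs. dict, same key
print(mismatch, match)
```

In this sketch the first comparison fails for the same reason as in the traceback: a single leaf is compared against a dict, so the nesting differs before any shapes are even looked at.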
About this issue
- State: closed
- Created 4 years ago
- Reactions: 2
- Comments: 33 (13 by maintainers)
This issue has been automatically marked as stale and been closed because it has not had recent activity. Thank you for your contributions.
If you think this still needs to be addressed please comment on this thread.
I cannot really test this because I don’t have your trans_model, but as far as I can tell it is not working because you are using the high-level API (Keras) to create the saved_model. For models with custom layers it is recommended to use the low-level way, like this:
For me this works well.
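The snippet that comment refers to is not preserved in this thread. As a rough sketch of what a low-level save and load can look like, the example below assumes that `tf.saved_model.save` and `tf.saved_model.load` are what is meant. `TinyLM`, `serve`, and `save_and_reload` are hypothetical names, and the tiny `tf.Module` only stands in for a real model with custom layers such as TFOpenAIGPTLMHeadModel.

```python
import tempfile

import tensorflow as tf


class TinyLM(tf.Module):
    """Hypothetical stand-in for a model with custom layers."""

    def __init__(self, vocab_size=16, hidden_size=4):
        super().__init__()
        self.embeddings = tf.Variable(
            tf.random.normal([vocab_size, hidden_size]), name="embeddings")

    # An explicit input signature fixes the structure of the exported call.
    @tf.function(input_signature=[
        tf.TensorSpec(shape=[None, None], dtype=tf.int32, name="input_ids"),
        tf.TensorSpec(shape=[None, None], dtype=tf.int32, name="token_type_ids"),
    ])
    def serve(self, input_ids, token_type_ids):
        hidden = (tf.gather(self.embeddings, input_ids)
                  + tf.gather(self.embeddings, token_type_ids))
        # Project back onto the vocabulary with the shared embedding matrix.
        logits = tf.einsum("bth,vh->btv", hidden, self.embeddings)
        return {"logits": logits}


def save_and_reload(export_dir):
    """Save with the low-level SavedModel API and load the result back."""
    model = TinyLM()
    tf.saved_model.save(
        model, export_dir, signatures={"serving_default": model.serve})
    restored = tf.saved_model.load(export_dir)
    return model, restored


export_dir = tempfile.mkdtemp()
original, restored = save_and_reload(export_dir)

ids = tf.constant([[1, 2, 3]], dtype=tf.int32)
types = tf.constant([[0, 0, 1]], dtype=tf.int32)

# Exported signatures are called with keyword arguments.
outputs = restored.signatures["serving_default"](
    input_ids=ids, token_type_ids=types)
print(outputs["logits"].shape)
```

The object returned by `tf.saved_model.load` is not a Keras model: it exposes the exported signatures and traced functions rather than methods such as `summary()` or `fit()`, which is the trade-off of going through the low-level API.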