transformers: tf.keras.models.load_model() does not load saved model that includes TFOpenAIGPTLMHeadModel layer

  • transformers version: 3.1.0
  • Platform: linux
  • Python version: 3
  • Tensorflow version: 2.3.0

To reproduce

Steps to reproduce the behavior:

  1. Load the model with TFOpenAIGPTLMHeadModel
  2. Add input layers
  3. Save the model
  4. Load the saved model
from transformers import TFOpenAIGPTLMHeadModel
import tensorflow as tf

tf_model = TFOpenAIGPTLMHeadModel.from_pretrained('./trans_model', from_pt=True)  # ./trans_model is a local directory containing a PyTorch pre-trained model
max_len = None

input_ids = tf.keras.layers.Input(shape=(max_len,), name='input_ids_layer', dtype='int32')
token_type_ids = tf.keras.layers.Input(shape=(max_len,), name='token_type_ids_layer', dtype='int32')
keras_input = [input_ids, token_type_ids]

qa_output = tf_model(input_ids, token_type_ids=token_type_ids)[0]
keras_model = tf.keras.Model(inputs=keras_input, outputs=qa_output)
keras_model.summary()
keras_model.save("./saved_model")
print('**************************')
model = tf.keras.models.load_model("./saved_model")
The call to tf.keras.models.load_model then fails with the following traceback:

Traceback (most recent call last):
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/util/nest.py", line 395, in assert_same_structure
    expand_composites)
ValueError: The two structures don't have the same nested structure.

First structure: type=TensorSpec str=TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs')

Second structure: type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}

More specifically: Substructure "type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}" is a sequence, while substructure "type=TensorSpec str=TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs')" is not

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "interact_test.py", line 208, in <module>
    run()
  File "interact_test.py", line 180, in run
    model = tf.keras.models.load_model("./saved_model")
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/save.py", line 187, in load_model
    return saved_model_load.load(filepath, compile, options)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 121, in load
    path, options=options, loader_cls=KerasObjectLoader)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/saved_model/load.py", line 633, in load_internal
    ckpt_options)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 194, in __init__
    super(KerasObjectLoader, self).__init__(*args, **kwargs)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/saved_model/load.py", line 130, in __init__
    self._load_all()
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 221, in _load_all
    self._finalize_objects()
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 526, in _finalize_objects
    _finalize_saved_model_layers(layers_revived_from_saved_model)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 706, in _finalize_saved_model_layers
    inputs = infer_inputs_from_restored_call_function(call_fn)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 985, in infer_inputs_from_restored_call_function
    spec = nest.map_structure(common_spec, spec, spec2)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/util/nest.py", line 629, in map_structure
    expand_composites=expand_composites)
  File "/home/t9kuser/.local/lib/python3.6/site-packages/tensorflow/python/util/nest.py", line 402, in assert_same_structure
    % (str(e), str1, str2))
ValueError: The two structures don't have the same nested structure.

First structure: type=TensorSpec str=TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs')

Second structure: type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}

More specifically: Substructure "type=dict str={'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}" is a sequence, while substructure "type=TensorSpec str=TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs')" is not
Entire first structure:
.
Entire second structure:
{'input_ids': .}
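
The mismatch reported above is between the nested input structure traced when the transformer layer was saved (a dict keyed by 'input_ids') and the plain TensorSpec Keras infers when restoring the layer. As a rough sketch of a possible workaround (not confirmed in this thread, and likely version-dependent), building the functional model by calling the transformer with a single dict of inputs keeps the traced call signature a dict on both sides:

from transformers import TFOpenAIGPTLMHeadModel
import tensorflow as tf

# Sketch only: uses the public 'openai-gpt' checkpoint instead of ./trans_model.
tf_model = TFOpenAIGPTLMHeadModel.from_pretrained('openai-gpt')

input_ids = tf.keras.layers.Input(shape=(None,), name='input_ids_layer', dtype='int32')
token_type_ids = tf.keras.layers.Input(shape=(None,), name='token_type_ids_layer', dtype='int32')

# Call the transformer with a dict so the traced input structure is a dict as well.
qa_output = tf_model({'input_ids': input_ids, 'token_type_ids': token_type_ids})[0]

keras_model = tf.keras.Model(inputs=[input_ids, token_type_ids], outputs=qa_output)
keras_model.save("./saved_model")

Whether tf.keras.models.load_model then restores the model cleanly depends on the transformers and TensorFlow versions; the lower-level tf.saved_model approach suggested in the comments below sidesteps the Keras restore logic entirely.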

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Reactions: 2
  • Comments: 33 (13 by maintainers)

Most upvoted comments

This issue has been automatically marked as stale and closed because it has not had recent activity. Thank you for your contributions.

If you think this still needs to be addressed please comment on this thread.

I cannot really test this because I don't have your trans_model, but as far as I can tell it is not working because you are using the high-level Keras API to create the SavedModel. For models with custom layers it is recommended to save and load the low-level way, like this:

from transformers import TFOpenAIGPTLMHeadModel
import tensorflow as tf

tf_model = TFOpenAIGPTLMHeadModel.from_pretrained('openai-gpt')
max_len = None

input_ids = tf.keras.layers.Input(shape=(max_len,), name='input_ids_layer', dtype='int32')
token_type_ids = tf.keras.layers.Input(shape=(max_len,), name='token_type_ids_layer', dtype='int32')
keras_input = [input_ids, token_type_ids]

qa_output = tf_model(input_ids, token_type_ids=token_type_ids)[0]
keras_model = tf.keras.Model(inputs=keras_input, outputs=qa_output)
keras_model.summary()
tf.saved_model.save(keras_model, "./saved_model")
print('**************************')
model = tf.saved_model.load("./saved_model")

For me this works well.
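
As a follow-up sketch of how the restored object can then be used: tf.saved_model.load returns a plain SavedModel object rather than a Keras model, so inference goes through its concrete functions or serving signatures. The signature key and argument names below ('serving_default', 'input_ids_layer', 'token_type_ids_layer') are assumptions based on the Keras input layer names above and should be verified by inspecting loaded.signatures:

import tensorflow as tf

loaded = tf.saved_model.load("./saved_model")

# List the signatures that were actually exported; the key used below is an assumption.
print(list(loaded.signatures.keys()))

infer = loaded.signatures["serving_default"]
outputs = infer(
    input_ids_layer=tf.constant([[40, 481, 1842]], dtype=tf.int32),  # hypothetical token ids
    token_type_ids_layer=tf.constant([[0, 0, 0]], dtype=tf.int32),
)
print(outputs)  # dict of output tensors keyed by the model's output names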