transformers: ModuleAttributeError occurs during Converting TensorFlow Checkpoints (BERT)
Environment info
- transformers version: 4.1.1
- Platform: Linux-4.15.0-129-generic-x86_64-with-glibc2.10
- Python version: 3.8.3
- PyTorch version (GPU?): 1.7.0 (True)
- Tensorflow version (GPU?): 2.3.1 (True)
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>
Who can help
albert, bert, GPT2, XLM: @LysandreJik
Information
Model I am using (Bert, XLNet …): Bert
The problem arises when using:
- the official example scripts: (give details below)
- my own modified scripts: (give details below)
The task I am working on is:
- Convert TF v1 ckpt to PyTorch
To reproduce
I tried to convert a TensorFlow checkpoint, but a ModuleAttributeError occurred.
What I run:
****@**** $ transformers-cli convert --model_type bert \
>   --tf_checkpoint $MODEL_DIR/model.ckpt \
>   --config ****/bert_config.json \
>   --pytorch_dump_output $MODEL_DIR/pytorch_model.bin
(In this case, bert_config.json is in a separate folder, but it corresponds to the ckpt.)
Output is:
Traceback (most recent call last):
  File "/****/.pyenv/versions/anaconda3-2020.07/bin/transformers-cli", line 8, in <module>
    sys.exit(main())
  File "/****/.pyenv/versions/anaconda3-2020.07/lib/python3.8/site-packages/transformers/commands/transformers_cli.py", line 51, in main
    service.run()
  File "/****/.pyenv/versions/anaconda3-2020.07/lib/python3.8/site-packages/transformers/commands/convert.py", line 105, in run
    convert_tf_checkpoint_to_pytorch(self._tf_checkpoint, self._config, self._pytorch_dump_output)
  File "/****/.pyenv/versions/anaconda3-2020.07/lib/python3.8/site-packages/transformers/models/bert/convert_bert_original_tf_checkpoint_to_pytorch.py", line 36, in convert_tf_checkpoint_to_pytorch
    load_tf_weights_in_bert(model, config, tf_checkpoint_path)
  File "/****/.pyenv/versions/anaconda3-2020.07/lib/python3.8/site-packages/transformers/models/bert/modeling_bert.py", line 155, in load_tf_weights_in_bert
    pointer.shape == array.shape
  File "/****/.pyenv/versions/anaconda3-2020.07/lib/python3.8/site-packages/torch/nn/modules/module.py", line 778, in __getattr__
    raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'BertEmbeddings' object has no attribute 'shape'
Expected behavior
It makes sense that BertEmbeddings (an nn.Module) doesn’t have a shape attribute, so the conversion seems to have stopped on a module instead of a weight.
Can this error depend on the original TensorFlow checkpoint? If so, are there any tips for dealing with it?
I really appreciate any help you can provide.
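To make the traceback easier to reason about, here is a minimal, stdlib-only sketch of how a name-based walk can end on a module rather than a weight. It is a simplified model, not the code in modeling_bert.py: FakeModule, FakeParameter, resolve, and the variable name unknown_table are all hypothetical, and the assumption is that path components with no matching attribute are skipped instead of aborting the walk.

```python
# Simplified illustration only; none of these names exist in transformers or torch.

class FakeModule:
    """Stands in for an nn.Module: holds children but has no `shape`."""

    def __init__(self, **children):
        for name, child in children.items():
            setattr(self, name, child)


class FakeParameter:
    """Stands in for a weight tensor: has a `shape`."""

    def __init__(self, shape):
        self.shape = shape


def resolve(model, variable_name):
    """Follow a slash-separated TF variable name through the model.

    Assumption: a component with no matching attribute is skipped, so the
    walk can finish on a module instead of a parameter.
    """
    pointer = model
    for part in variable_name.split("/"):
        pointer = getattr(pointer, part, pointer)
    return pointer


model = FakeModule(
    bert=FakeModule(
        embeddings=FakeModule(word_embeddings=FakeParameter((30522, 768)))
    )
)

ok = resolve(model, "bert/embeddings/word_embeddings")
bad = resolve(model, "bert/embeddings/unknown_table")  # hypothetical variable name

print(hasattr(ok, "shape"))   # True: the walk ended on a parameter
print(hasattr(bad, "shape"))  # False: the walk stopped on the embeddings module
```

If the real conversion behaves like this sketch, a checkpoint variable under bert/embeddings whose last component has no counterpart in the PyTorch model would produce exactly this kind of "no attribute 'shape'" failure.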
About this issue
- State: closed
- Created 3 years ago
- Comments: 21 (20 by maintainers)
Fantastic! Great job, thank you for sharing your progress!
Hmmm indeed it seems that the model doesn’t fit one-to-one to our architecture. You might need to slightly tweak the architecture and conversion script to load it, but you’re probably the most expert on the matter. If you want me to take a deeper look, feel free to send me the weights/config so I can take a look locally.
Hmmm I understand.
I don’t think it’s the global_step, as this gets skipped here: https://github.com/huggingface/transformers/blob/b020a736c374460af1b34267283f957988350630/src/transformers/models/bert/modeling_bert.py#L120-L125
As a way to debug what’s happening here, could you add the following log statement:
right after the following line: https://github.com/huggingface/transformers/blob/b020a736c374460af1b34267283f957988350630/src/transformers/models/bert/modeling_bert.py#L116
It would then look like:
We can then try to identify what’s happening with the checkpoint.
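Independently of the log statement suggested above, another way to see what the checkpoint contains is to list its variable names and shapes and drop the optimizer bookkeeping. The sketch below is only an illustration: group_model_variables and SKIPPED_PARTS are hypothetical names, the skip set is an assumed subset of what the conversion ignores, and the sample list is made up. In practice the (name, shape) pairs could come from tf.train.list_variables(checkpoint_path).

```python
from collections import defaultdict

# Assumed examples of name components that the conversion ignores.
SKIPPED_PARTS = {"adam_v", "adam_m", "global_step"}


def group_model_variables(variables):
    """Group (name, shape) pairs by their first two scope components.

    Variables containing a skipped component are left out, so only the
    names that must map onto the PyTorch model remain.
    """
    groups = defaultdict(list)
    for name, shape in variables:
        parts = name.split("/")
        if any(part in SKIPPED_PARTS for part in parts):
            continue
        scope = "/".join(parts[:2])
        groups[scope].append((name, tuple(shape)))
    return dict(groups)


# Made-up sample; replace with the pairs read from the real checkpoint.
sample = [
    ("bert/embeddings/word_embeddings", [30522, 768]),
    ("bert/embeddings/word_embeddings/adam_m", [30522, 768]),
    ("bert/encoder/layer_0/attention/self/query/kernel", [768, 768]),
    ("global_step", []),
]

for scope, entries in group_model_variables(sample).items():
    print(scope)
    for name, shape in entries:
        print("   ", name, shape)
```

Any name in the output that has no obvious counterpart in the PyTorch BERT model is a candidate for the variable that triggers the error.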