transformers: Fails to convert the Funnel Transformer TensorFlow checkpoint to PyTorch when using the official script

Environment info

  • transformers version: 3.5.1
  • Platform: CentOS
  • Python version: 3.7
  • PyTorch version (GPU?): 1.6.0
  • Tensorflow version (GPU?): 2.3.2
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: yes

Information

Model I am using (Bert, XLNet …): Funnel Transformer

To reproduce

Steps to reproduce the behavior:

1. Run the script convert_funnel_original_tf_checkpoint_to_pytorch.py (@sgugger @LysandreJik); it raises the following error:

Traceback (most recent call last):
  File "run_pretraining.py", line 158, in <module>
    convert_tf_checkpoint_to_pytorch(args.tf_checkpoint_path, args.config_file, args.pytorch_dump_path)
  File "run_pretraining.py", line 40, in convert_tf_checkpoint_to_pytorch
    load_tf_weights_in_funnel(model, config, tf_checkpoint_path)
  File "run_pretraining.py", line 122, in load_tf_weights_in_funnel
    pointer = getattr(pointer, _layer_map[m_name])
  File "/root/miniconda3/envs/transformers/lib/python3.7/site-packages/torch/nn/modules/module.py", line 772, in __getattr__
    type(self).__name__, name))
torch.nn.modules.module.ModuleAttributeError: 'FunnelForPreTraining' object has no attribute 'embeddings'
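
For reference, the failing call corresponds to roughly the following (a minimal sketch with placeholder paths; the import path assumes the script is used from inside the transformers package rather than copied into run_pretraining.py as in the traceback):

from transformers.convert_funnel_original_tf_checkpoint_to_pytorch import (
    convert_tf_checkpoint_to_pytorch,
)

# Placeholder paths -- substitute the real TF checkpoint, the config file,
# and the desired output location for the PyTorch weights.
convert_tf_checkpoint_to_pytorch(
    "/path/to/model.ckpt",         # tf_checkpoint_path
    "/path/to/net_config.json",    # config_file
    "/path/to/pytorch_model.bin",  # pytorch_dump_path
)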

Expected behavior

The script converts the TensorFlow checkpoint to a PyTorch checkpoint without raising an error.

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 15 (4 by maintainers)

Most upvoted comments

@LysandreJik Which config file should be used? I am using the original full-size TensorFlow one, net_config.json:

{
    "block_size": "6_6_6",
    "d_embed": 768,
    "d_head": 64,
    "d_inner": 3072,
    "d_model": 768,
    "decoder_size": "2",
    "dropact": 0.0,
    "dropatt": 0.1,
    "dropout": 0.1,
    "ff_activation": "gelu",
    "init": "truncated_normal",
    "init_range": 0.1,
    "init_std": 0.02,
    "n_head": 12,
    "pool_q_only": true,
    "pooling_size": 2,
    "pooling_type": "mean",
    "rel_attn_type": "factorized",
    "separate_cls": true,
    "vocab_size": 21128
}
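
(For comparison, those fields appear to map onto the Hugging Face FunnelConfig arguments roughly as below; the argument names on the HF side are assumptions based on the current transformers API, so treat this as a sketch rather than an official mapping.)

from transformers import FunnelConfig

# Hypothetical field-by-field translation of the net_config.json above;
# each comment names the original field the argument comes from.
config = FunnelConfig(
    vocab_size=21128,             # vocab_size
    block_sizes=[6, 6, 6],        # block_size "6_6_6"
    num_decoder_layers=2,         # decoder_size "2"
    d_model=768,                  # d_model
    n_head=12,                    # n_head
    d_head=64,                    # d_head
    d_inner=3072,                 # d_inner
    hidden_act="gelu",            # ff_activation
    hidden_dropout=0.1,           # dropout
    attention_dropout=0.1,        # dropatt
    activation_dropout=0.0,       # dropact
    initializer_range=0.1,        # init_range
    initializer_std=0.02,         # init_std
    pooling_type="mean",          # pooling_type
    attention_type="factorized",  # rel_attn_type
    separate_cls=True,            # separate_cls
    pool_q_only=True,             # pool_q_only
)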

I got the same problem as you, and I managed to convert the checkpoint by using the config file from the Hugging Face model hub. If you use the 6-6-6 block layout, use this one: https://huggingface.co/funnel-transformer/intermediate/raw/main/config.json and change the vocab size.
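
A minimal sketch of that suggestion (the vocab size matches the net_config.json posted above; the output directory name is just an example):

from transformers import FunnelConfig

# Start from the 6-6-6 config on the model hub, then override the vocab
# size to match the original checkpoint (21128 per the net_config.json above).
config = FunnelConfig.from_pretrained("funnel-transformer/intermediate")
config.vocab_size = 21128
config.save_pretrained("./funnel-config")  # writes ./funnel-config/config.json

Then point the conversion script's config argument at the saved config.json.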