transformers: resize_token_embeddings error for Transformer-XL

๐Ÿ› Bug

Information

Model I am using: Transformer-XL

Language I am using the model on: English

The problem arises when using:

  • my own modified scripts: a fine-tuning script for TransfoXLLMHeadModel

To reproduce

The following code aims to add two new tokens, 'wug' and 'wugs', to the vocabulary. After adding them to the tokenizer, we call resize_token_embeddings on the model to update its input embeddings to the correct dimensions for the new tokens.

import torch
from transformers import TransfoXLTokenizer, TransfoXLLMHeadModel

model = TransfoXLLMHeadModel.from_pretrained('transfo-xl-wt103')
tokenizer = TransfoXLTokenizer.from_pretrained('transfo-xl-wt103')

tokenizer.add_tokens(['wug', 'wugs'])
model.resize_token_embeddings(len(tokenizer))

Running the above gives the following error:

Traceback (most recent call last):
  File "bug.py", line 9, in <module>
    model.resize_token_embeddings(len(tokenizer))
  File "/home/AD/rdsie/anaconda3/envs/lign251/lib/python3.7/site-packages/transformers/modeling_utils.py", line 198, in resize_token_embeddings
    model_embeds = base_model._resize_token_embeddings(new_num_tokens)
  File "/home/AD/rdsie/anaconda3/envs/lign251/lib/python3.7/site-packages/transformers/modeling_utils.py", line 213, in _resize_token_embeddings
    new_embeddings = self._get_resized_embeddings(old_embeddings, new_num_tokens)
  File "/home/AD/rdsie/anaconda3/envs/lign251/lib/python3.7/site-packages/transformers/modeling_utils.py", line 234, in _get_resized_embeddings
    old_num_tokens, old_embedding_dim = old_embeddings.weight.size()
  File "/home/AD/rdsie/anaconda3/envs/lign251/lib/python3.7/site-packages/torch/nn/modules/module.py", line 576, in __getattr__
    type(self).__name__, name))
AttributeError: 'AdaptiveEmbedding' object has no attribute 'weight'

It seems that resize_token_embeddings() does not currently account for the adaptive input embeddings (AdaptiveEmbedding) used by TransfoXLLMHeadModel.
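
For reference, a minimal sketch (continuing from the reproduction snippet above) of what the generic resize logic runs into: the input embedding of TransfoXLLMHeadModel is an AdaptiveEmbedding, which holds one nn.Embedding per vocabulary cluster in emb_layers rather than a single weight tensor.

embeddings = model.get_input_embeddings()
print(type(embeddings).__name__)         # AdaptiveEmbedding
print(hasattr(embeddings, 'weight'))     # False -- hence the AttributeError above
for i, layer in enumerate(embeddings.emb_layers):
    print(i, tuple(layer.weight.shape))  # one (cluster_size, dim) weight matrix per cutoff range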

Expected behavior

We expect resize_token_embeddings to update the embedding layers appropriately for the new vocabulary size, so that the model can be used correctly with the new tokens.
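
As a sketch of that expected behavior (continuing from the reproduction snippet; the printed ids are only illustrative), the added tokens should receive ids just past the original vocabulary, and those ids should index valid rows of the resized input embedding:

new_ids = tokenizer.convert_tokens_to_ids(['wug', 'wugs'])
print(new_ids)                           # ids appended after the original vocabulary

input_ids = torch.tensor([new_ids])
word_embeds = model.get_input_embeddings()(input_ids)
print(word_embeds.shape)                 # (1, 2, embedding_dim) once resizing works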

Thank you in advance

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Reactions: 2
  • Comments: 15 (11 by maintainers)

Most upvoted comments

Hi @vsieplus ,

This is a known bug and sadly we don't have a solution for it right now. TransfoXLLMHeadModel uses adaptive weight embeddings, which makes this function not straightforward to implement. It should be implemented in the long run though - I will note it down. @thomwolf @LysandreJik

Thanks a lot @sgugger for answering here! As @sgugger mentioned, it'd be great if you can add a _resize_token_embeddings() function to TransfoXLPreTrainedModel.

The solution looks great to me @vsieplus 😃

You could make it a bit more compact, but that's a nitpick:

    embeddings = self.get_input_embeddings()
    # tokens that must live in the last (tail) embedding cluster of the AdaptiveEmbedding
    new_num_tokens_last = new_num_tokens - sum(emb.weight.shape[0] for emb in embeddings.emb_layers[:-1])
    new_embeddings_last = self._get_resized_embeddings(embeddings.emb_layers[-1], new_num_tokens_last)
    embeddings.emb_layers[-1] = new_embeddings_last

    self.set_input_embeddings(embeddings)

No, it should not. XLNet uses the standard nn.Embedding, so it should be fine.
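
For what it's worth, a quick sanity check along those lines (the xlnet-base-cased checkpoint is just a commonly used example, not taken from this thread):

    from transformers import XLNetLMHeadModel, XLNetTokenizer

    # XLNet's input embedding is a plain nn.Embedding, so the generic resize applies
    xlnet_tokenizer = XLNetTokenizer.from_pretrained('xlnet-base-cased')
    xlnet_model = XLNetLMHeadModel.from_pretrained('xlnet-base-cased')

    xlnet_tokenizer.add_tokens(['wug', 'wugs'])
    xlnet_model.resize_token_embeddings(len(xlnet_tokenizer))  # no AttributeError here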