transformers: resize_token_embeddings error for Transformer-XL

๐Ÿ› Bug

Information

Model I am using: Transformer-XL

Language I am using the model on: English

The problem arises when using:

  • my own modified scripts: a fine-tuning script for TransfoXLLMHeadModel

To reproduce

The following code aims to add two new tokens, 'wug' and 'wugs', to the vocabulary. After adding them to the tokenizer, we call resize_token_embeddings on the model to update its input embeddings to the correct dimensions for the new tokens.

import torch
from transformers import TransfoXLTokenizer, TransfoXLLMHeadModel

model = TransfoXLLMHeadModel.from_pretrained('transfo-xl-wt103')
tokenizer = TransfoXLTokenizer.from_pretrained('transfo-xl-wt103')

tokenizer.add_tokens(['wug', 'wugs'])
model.resize_token_embeddings(len(tokenizer))

Running the above gives the following error:

Traceback (most recent call last):
  File "bug.py", line 9, in <module>
    model.resize_token_embeddings(len(tokenizer))
  File "/home/AD/rdsie/anaconda3/envs/lign251/lib/python3.7/site-packages/transformers/modeling_utils.py", line 198, in resize_token_embeddings
    model_embeds = base_model._resize_token_embeddings(new_num_tokens)
  File "/home/AD/rdsie/anaconda3/envs/lign251/lib/python3.7/site-packages/transformers/modeling_utils.py", line 213, in _resize_token_embeddings
    new_embeddings = self._get_resized_embeddings(old_embeddings, new_num_tokens)
  File "/home/AD/rdsie/anaconda3/envs/lign251/lib/python3.7/site-packages/transformers/modeling_utils.py", line 234, in _get_resized_embeddings
    old_num_tokens, old_embedding_dim = old_embeddings.weight.size()
  File "/home/AD/rdsie/anaconda3/envs/lign251/lib/python3.7/site-packages/torch/nn/modules/module.py", line 576, in __getattr__
    type(self).__name__, name))
AttributeError: 'AdaptiveEmbedding' object has no attribute 'weight'

It seems that resize_token_embeddings() does not currently account for the adaptive input embeddings (AdaptiveEmbedding) used by TransfoXLLMHeadModel.
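
For reference, a minimal sketch (continuing from the reproduction snippet above) of what the generic resize logic runs into: the input embedding of TransfoXLLMHeadModel is an AdaptiveEmbedding, which holds one nn.Embedding per vocabulary cluster in emb_layers rather than a single weight tensor.

embeddings = model.get_input_embeddings()
print(type(embeddings).__name__)         # AdaptiveEmbedding
print(hasattr(embeddings, 'weight'))     # False -- hence the AttributeError above
for i, layer in enumerate(embeddings.emb_layers):
    print(i, tuple(layer.weight.shape))  # one (cluster_size, dim) weight matrix per cutoff range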

Expected behavior

We expect resize_token_embeddings to update the embedding layers appropriately for the new vocabulary size, so that the model can be used correctly with the new tokens.
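
As a sketch of that expected behavior (continuing from the reproduction snippet; the printed ids are only illustrative), the added tokens should receive ids just past the original vocabulary, and those ids should index valid rows of the resized input embedding:

new_ids = tokenizer.convert_tokens_to_ids(['wug', 'wugs'])
print(new_ids)                           # ids appended after the original vocabulary

input_ids = torch.tensor([new_ids])
word_embeds = model.get_input_embeddings()(input_ids)
print(word_embeds.shape)                 # (1, 2, embedding_dim) once resizing works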

Thank you in advance

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Reactions: 2
  • Comments: 15 (11 by maintainers)

Most upvoted comments

Hi @vsieplus ,

This is a known bug and sadly we don't have a solution for it right now. TransfoXLLMHeadModel uses adaptive weight embeddings, which makes this function not straightforward to implement. It should be implemented in the long run though - I will note it down. @thomwolf @LysandreJik

Thanks a lot @sgugger for answering here! As @sgugger mentioned, it'd be great if you can add a _resize_token_embeddings() function to TransfoXLPreTrainedModel.

The solution looks great to me @vsieplus 😃

You could make it a bit more compact, but that's a nitpick:

    embeddings = self.get_input_embeddings()
    # tokens that must live in the last (tail) embedding cluster of the AdaptiveEmbedding
    new_num_tokens_last = new_num_tokens - sum(emb.weight.shape[0] for emb in embeddings.emb_layers[:-1])
    new_embeddings_last = self._get_resized_embeddings(embeddings.emb_layers[-1], new_num_tokens_last)
    embeddings.emb_layers[-1] = new_embeddings_last

    self.set_input_embeddings(embeddings)

No, it should not. XLNet uses the standard nn.Embedding, so it should be fine.
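
For what it's worth, a quick sanity check along those lines (the xlnet-base-cased checkpoint is just a commonly used example, not taken from this thread):

    from transformers import XLNetLMHeadModel, XLNetTokenizer

    # XLNet's input embedding is a plain nn.Embedding, so the generic resize applies
    xlnet_tokenizer = XLNetTokenizer.from_pretrained('xlnet-base-cased')
    xlnet_model = XLNetLMHeadModel.from_pretrained('xlnet-base-cased')

    xlnet_tokenizer.add_tokens(['wug', 'wugs'])
    xlnet_model.resize_token_embeddings(len(xlnet_tokenizer))  # no AttributeError here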