transformers: GPT2 crashing at loss.backward()
Environment info
- `transformers` version: 3.0.2
- Platform: Ubuntu
- Python version: 3.6
- PyTorch version (GPU?): 1.5.0
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: Yes
Information
I'm trying to fine-tune a GPT2 model, but the GPU is crashing after loss.backward(). I thought it might just be my code, but I ran some different code that also fine-tunes GPT2, and it crashed in the same manner.
I'm getting this warning as well:
WARNING - transformers.modeling_utils - Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at gpt2 and are newly initialized: ['h.0.attn.masked_bias', 'h.1.attn.masked_bias', 'h.2.attn.masked_bias', 'h.3.attn.masked_bias', 'h.4.attn.masked_bias', 'h.5.attn.masked_bias', 'h.6.attn.masked_bias', 'h.7.attn.masked_bias', 'h.8.attn.masked_bias', 'h.9.attn.masked_bias', 'h.10.attn.masked_bias', 'h.11.attn.masked_bias', 'lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
A week or two ago everything was working fine, but now the same code is crashing on loss.backward().
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 20 (10 by maintainers)
Did you add any new tokens to the tokenizer?
Get the shape of the embeddings using `model.transformer.wte.weight.shape`. The first dim of the shape and the `len` of the tokenizer should match. Check whether that assertion holds; if not, your vocab size and embedding input size are not matching. If you added new tokens to the vocab, you'll need to resize the token embeddings of the model. You can resize it using