transformers: GPT2 crashing at loss.backward()
Environment info
- `transformers` version: 3.0.2
- Platform: Ubuntu
- Python version: 3.6
- PyTorch version (GPU?): 1.5.0
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: Yes
Information
I'm trying to fine-tune a GPT2 model, but the GPU is crashing after loss.backward(). I thought it might just be my code, but I ran some different code that also fine-tunes GPT2, and it crashed in the same manner.
I'm getting this warning as well:
WARNING - transformers.modeling_utils - Some weights of GPT2LMHeadModel were not initialized from the model checkpoint at gpt2 and are newly initialized: ['h.0.attn.masked_bias', 'h.1.attn.masked_bias', 'h.2.attn.masked_bias', 'h.3.attn.masked_bias', 'h.4.attn.masked_bias', 'h.5.attn.masked_bias', 'h.6.attn.masked_bias', 'h.7.attn.masked_bias', 'h.8.attn.masked_bias', 'h.9.attn.masked_bias', 'h.10.attn.masked_bias', 'h.11.attn.masked_bias', 'lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
A week or two ago everything was working fine, but now the same code is crashing on loss.backward().
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 20 (10 by maintainers)
Did you add any new tokens to the tokenizer?
Get the shape of the embeddings using `model.transformer.wte.weight.shape`. The first dim of the shape and the `len` of the tokenizer should match. Check whether that assertion holds; if not, your vocab size and embedding input size are not matching. If you added new tokens to the vocab, you'll need to resize the token embeddings of the model. You can resize it using