transformers: RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)`
Hi, I tried to add some other embeddings to the BertEmbedding source code and then load the pretrained weights 'bert-base-chinese'.
When I run the forward method, I get:
`RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle)`
Can someone help, please? Thanks a lot.
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 53
- Comments: 19 (2 by maintainers)
I think one possible cause is that the padding token for token_type_ids is out of range. Say you add four extra token_type_ids; the ids assigned to 'pad', 'cls' and 'unk' then follow your tokenizer's settings. The tokenizer may assign a large id to pad (100-something), so if your token_type_embedding is initialized with only 4 classes, indexing it with that id produces exactly this kind of error. So you might increase your token-type vocabulary to cover the special tokens and manually map them to 0, 1, 2, etc. Hope it helps.
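A minimal sketch of the failure mode described above, in plain PyTorch (the table sizes are illustrative): an embedding table built for BERT's default 2 token types fails on any id >= 2, and enlarging the table fixes it.

```python
import torch
import torch.nn as nn

# Table sized for BERT's default type_vocab_size of 2.
small_table = nn.Embedding(num_embeddings=2, embedding_dim=8)
type_ids = torch.tensor([[0, 1, 3]])  # 3 is out of range for a size-2 table

try:
    small_table(type_ids)             # on CPU this raises a clear IndexError
except IndexError as err:
    print("out of range:", err)

# Fix: enlarge the table to cover every token type id you actually emit.
big_table = nn.Embedding(num_embeddings=4, embedding_dim=8)
out = big_table(type_ids)             # valid now
print(out.shape)                      # torch.Size([1, 3, 8])
```

On the GPU the same out-of-range lookup often surfaces as the opaque CUBLAS error instead of an IndexError, which is why it is hard to diagnose.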
I had not given my model the vocab size of my tokenizer when I initialized it, which caused this error. Running the model on the CPU (as suggested in https://github.com/huggingface/transformers/issues/3090) gave me a better error message that let me figure this out, so that's a general tip if you hit this error.
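The CPU tip can be sketched like this; `Toy`, `run_on_cpu`, and the batch contents are hypothetical stand-ins for your own model and inputs, not anyone's actual code from this thread.

```python
import torch
import torch.nn as nn

class Toy(nn.Module):
    """Stand-in model with a deliberately too-small embedding table."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(num_embeddings=10, embedding_dim=4)

    def forward(self, input_ids):
        return self.emb(input_ids)

def run_on_cpu(model, batch):
    # Move everything off the GPU so the real error message surfaces.
    model = model.cpu()
    batch = {k: v.cpu() for k, v in batch.items()}
    return model(**batch)

model = Toy()
bad_batch = {"input_ids": torch.tensor([[3, 42]])}  # 42 exceeds vocab of 10
try:
    run_on_cpu(model, bad_batch)
except IndexError as err:
    print("readable CPU error:", err)  # instead of the opaque CUBLAS message
```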
I am getting the same error. I am trying to update the token_type_embeddings by having 4 types instead of 2.
@vdabravolski As for the tokenizer: I added special tokens, updated the length of the tokenizer, and resized the model's token embeddings:
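A sketch of that approach; a small random-weight config is used here so the snippet runs offline, but with a real checkpoint you would use `from_pretrained(...)` for both tokenizer and model, call `tokenizer.add_special_tokens(...)`, and then resize with `model.resize_token_embeddings(len(tokenizer))`.

```python
from transformers import BertConfig, BertModel

# Tiny illustrative config (not a pretrained checkpoint).
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64)
model = BertModel(config)

# Suppose the tokenizer grew by 2 ids after add_special_tokens(...):
# the embedding matrix must be resized to match, or ids >= 100 will
# index past the end of the table and crash.
model.resize_token_embeddings(config.vocab_size + 2)
print(model.get_input_embeddings().num_embeddings)  # 102
```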
Try removing/deleting the cached .lock files and run again
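A sketch of that cleanup, assuming the default Hugging Face cache location (the exact path varies by library version and the `HF_HOME` environment variable, so check yours first):

```shell
# Remove stale .lock files left in the Hugging Face cache, then re-run.
# Adjust the path if your cache lives elsewhere (e.g. HF_HOME is set).
find ~/.cache/huggingface -name "*.lock" -delete
```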
In my case, I had to use `device="cuda:8"` to specify a GPU other than the default `0`.
I solved it by reducing the batch_size.
This helped me solve my issue. I had initialized the tokenizer and the model from different checkpoints in `from_pretrained` (e.g. `from_pretrained('bert-large-uncased')` for one and `from_pretrained('bert-large-cased')` for the other).
Hello, how did you delete all the cache files? I am getting the same problem.