transformers: RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)`

Hi, I tried to add some extra embeddings to the BertEmbedding source code and then load the pretrained `bert-base-chinese` weights. When I run the forward method, I get `RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle)`. Can someone help, please? Thanks a lot.

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 53
  • Comments: 19 (2 by maintainers)

Most upvoted comments

I think one possible reason is that the padding token in your token_type_ids is out of range. Say you have four extra token type ids; the ids for 'pad', 'cls', and 'unk' then follow your tokenizer's settings, and BERT assigns its special tokens large ids (around 100). If your token_type_embedding is initialized with only 4 classes, looking up one of those large ids produces exactly this kind of error. So you might enlarge your token type vocabulary to account for the special tokens and manually map them to 0, 1, 2, etc. Hope it helps.
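In code, the check might look something like this (a minimal sketch; the checkpoint and the example ids below are assumptions for illustration). On the GPU an out-of-range lookup tends to surface as an opaque CUDA/cuBLAS error rather than an IndexError:

import torch
from transformers import BertModel

# Verify every token type id fits inside the token_type_embeddings table
# before the forward pass. The ids here are made up; 100 stands in for a
# hypothetical pad id that a custom tokenizer might emit.
model = BertModel.from_pretrained("bert-base-chinese")
token_type_ids = torch.tensor([[0, 1, 2, 3, 100]])

num_types = model.config.type_vocab_size  # 2 for stock BERT checkpoints
if int(token_type_ids.max()) >= num_types:
    raise ValueError(
        f"token_type_ids go up to {int(token_type_ids.max())}, but "
        f"type_vocab_size is {num_types}; remap the special tokens to ids "
        f"below {num_types} or enlarge the embedding table."
    )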

I had not given my model the vocab size of my tokenizer when I initialized it, which caused this error. Running the model on the CPU (as suggested in https://github.com/huggingface/transformers/issues/3090) gave me a better error message that let me figure this out, so that's a more general tip if you hit this error.
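For reference, the CPU run might look like this (a sketch; the checkpoint and input text are assumptions). On the CPU an out-of-range id raises a readable IndexError instead of the opaque cuBLAS failure:

import torch
from transformers import BertTokenizer, BertForMaskedLM

# Reproduce the failing forward pass on the CPU to get the real error.
tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertForMaskedLM.from_pretrained("bert-base-chinese")

inputs = tokenizer("你好,世界", return_tensors="pt")  # tensors stay on the CPU
with torch.no_grad():
    outputs = model(**inputs)  # a bad id fails here with a clear IndexError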

I am getting the same error. I am trying to update the token_type_embeddings to have 4 types instead of 2:

import torch.nn as nn

model.config.type_vocab_size = 4
token_embed = nn.Embedding(model.config.type_vocab_size, model.config.hidden_size)
token_embed.weight.data.uniform_(-1, 1)
model.bert.embeddings.token_type_embeddings = token_embed
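Note that this replaces the two pretrained type vectors with random ones; a variant that keeps them might look like this (a sketch; the checkpoint and the smaller init range are assumptions):

import torch
import torch.nn as nn
from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained("bert-base-chinese")

old_embed = model.bert.embeddings.token_type_embeddings  # shape (2, hidden_size)
new_embed = nn.Embedding(4, model.config.hidden_size)
new_embed.weight.data.uniform_(-0.02, 0.02)

# Preserve the pretrained vectors for type ids 0 and 1.
with torch.no_grad():
    new_embed.weight[:2] = old_embed.weight

model.bert.embeddings.token_type_embeddings = new_embed
model.config.type_vocab_size = 4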

@vdabravolski As for the tokenizer, I added special tokens, updated the tokenizer's length, and resized the model's token embeddings:

tokenizer.add_special_tokens(SPECIAL_TOKENS_DICT)
model.resize_token_embeddings(len(tokenizer))

Try deleting the cached .lock files and running again.
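A sketch for clearing them, assuming the default cache location (adjust the path if HF_HOME or TRANSFORMERS_CACHE points somewhere else):

from pathlib import Path

# Remove stale .lock files left behind in the Hugging Face cache.
cache_dir = Path.home() / ".cache" / "huggingface"
for lock_file in cache_dir.rglob("*.lock"):
    print(f"removing {lock_file}")
    lock_file.unlink()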

In my case, I had to pass device="cuda:8" to select a GPU other than the default device 0.
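An alternative sketch is to pin the process to one GPU before any CUDA call is made (the index 8 comes from the comment above):

import os
import torch

# Restrict the process to a single GPU; that card then appears as cuda:0.
os.environ["CUDA_VISIBLE_DEVICES"] = "8"
device = torch.device("cuda:0")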

I solved it by reducing the batch size.
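That fix is consistent with CUBLAS_STATUS_NOT_INITIALIZED sometimes meaning cuBLAS could not allocate its workspace because the GPU is nearly out of memory. A quick check (a sketch; mem_get_info needs a recent PyTorch version):

import torch

# Print free vs. total GPU memory just before the forward pass.
free_bytes, total_bytes = torch.cuda.mem_get_info()
print(f"free: {free_bytes / 1e9:.2f} GB / total: {total_bytes / 1e9:.2f} GB")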

The vocab-size tip above helped me solve my issue: I had initialized the tokenizer and the model from different from_pretrained checkpoints (e.g. from_pretrained('bert-large-uncased') for one and from_pretrained('bert-large-cased') for the other).
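A quick sanity check for this mismatch might look like the following (a sketch; the checkpoint name is an assumption):

from transformers import BertTokenizer, BertForMaskedLM

checkpoint = "bert-large-uncased"  # use the same name for both pieces
tokenizer = BertTokenizer.from_pretrained(checkpoint)
model = BertForMaskedLM.from_pretrained(checkpoint)

# With mismatched checkpoints, token ids can exceed the embedding table.
assert len(tokenizer) <= model.config.vocab_size, (
    f"tokenizer produces {len(tokenizer)} ids but the model only "
    f"holds {model.config.vocab_size} embeddings"
)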

Thanks @manalabssas, I was getting the same issue. I deleted all the cache files and it worked. Thanks for sharing.

Hello, how did you delete all the cache files? I'm getting the same problem.