tokenizers: Tokenizer import error

I ran my experiment today, but I am getting an error message saying that some classes from tokenizers cannot be found.

ImportError: cannot import name 'BertWordPieceTokenizer'

I am using the standard import.

from tokenizers import (ByteLevelBPETokenizer, BPETokenizer, SentencePieceBPETokenizer, BertWordPieceTokenizer)

The code was running perfectly last Friday. I saw you released a new transformers version. Did it change tokenizers?
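For reference, this is how I check which versions are actually installed (both libraries expose a __version__ attribute on a standard install, as far as I can tell):

import tokenizers
import transformers
print(tokenizers.__version__)
print(transformers.__version__)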

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 18 (3 by maintainers)

Most upvoted comments

After you install transformers, you need to uninstall tokenizers and then install it again.

I am using Colab, so I am doing this workaround until they release the new version:

!pip install transformers
!pip uninstall -y tokenizers
!pip install tokenizers

Hope it helps you!

Yes, for now only version 0.0.11 of tokenizers is compatible with transformers. We are working on an update that will handle newer versions, so if you need both libraries you will need to stick to 0.0.11 (unless you don't use tokenizers through transformers). In that case, please uninstall tokenizers entirely before re-installing the target version, so that the files do not get overwritten by pip.
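For example, a minimal sketch of that pin in a notebook cell (the 0.0.11 pin comes from the comment above; adjust it if your transformers release expects a different version):

!pip uninstall -y tokenizers
!pip install tokenizers==0.0.11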