ctransformers: transformers 4.34 caused NotImplementedError when calling CTransformersTokenizer(PreTrainedTokenizer)
transformers version: pip install transformers==4.34.0
ctransformers version: pip install ctransformers==0.2.27
I encountered the following error:
File ".venv\lib\site-packages\ctransformers\transformers.py", line 84, in __init__kages\ctransformers\transformers.py", line 84, in __init__
super().__init__(**kwargs)
File ".venv\lib\site-packages\transformers\tokenization_utils.py", line 366, in __init__
self._add_tokens(self.all_special_tokens_extended, special_tokens=True)
File ".venv\lib\site-packages\transformers\tokenization_utils.py", line 462, in _add_tokens
current_vocab = self.get_vocab().copy()
File ".venv\lib\site-packages\transformers\tokenization_utils_base.py", line 1715, in ``get_vocab
raise NotImplementedError()``
NotImplementedError
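```
A minimal way to reproduce (the model repo name is the usage example from the ctransformers README, not from this thread):
```python
from ctransformers import AutoModelForCausalLM, AutoTokenizer

# hf=True wraps the GGML model for use with the transformers API.
model = AutoModelForCausalLM.from_pretrained("marella/gpt-2-ggml", hf=True)

# Under transformers 4.34 this constructor raises NotImplementedError,
# because PreTrainedTokenizer.__init__ now calls self.get_vocab().
tokenizer = AutoTokenizer.from_pretrained(model)
```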
transformers changed PreTrainedTokenizer in tokenization_utils.py (commit 2da8853): _add_tokens now calls current_vocab = self.get_vocab().copy() on line 454.
PreTrainedTokenizer itself already implements added_tokens_decoder and __len__, so get_vocab is the only method on that path that still raises NotImplementedError().
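For illustration, here is a minimal sketch of the kind of get_vocab override that satisfies the new code path. This is a hypothetical patch, not the actual code from PR #155; it assumes the wrapped ctransformers LLM exposes vocab_size and detokenize, both of which exist on the ctransformers LLM class:
```python
from transformers import PreTrainedTokenizer


class PatchedCTransformersTokenizer(PreTrainedTokenizer):
    """Hypothetical sketch, not the PR #155 implementation."""

    def __init__(self, model, **kwargs):
        # Must be set before super().__init__(), because transformers 4.34
        # calls self.get_vocab() from _add_tokens() during base-class init.
        self._model = model
        super().__init__(**kwargs)

    @property
    def vocab_size(self) -> int:
        return self._model.vocab_size

    def get_vocab(self) -> dict:
        # token -> id mapping; 4.34's _add_tokens() copies this dict.
        vocab = {self._convert_id_to_token(i): i for i in range(self.vocab_size)}
        vocab.update(self.added_tokens_encoder)
        return vocab

    def _convert_id_to_token(self, index: int) -> str:
        return self._model.detokenize([index])
```
Setting self._model before calling super().__init__() is the key ordering detail: the base-class constructor now needs a working get_vocab() immediately.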
About this issue
- State: open
- Created 9 months ago
- Reactions: 2
- Comments: 17
Commits related to this issue
- downgrade transformers see https://github.com/marella/ctransformers/issues/154 — committed to caretech-owl/gerd by aleneum 8 months ago
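The workaround those commits apply is pinning transformers to a release that predates the get_vocab() call in _add_tokens:
```bash
# Stay below 4.34 until a fixed ctransformers build is available.
pip install "transformers<4.34"
```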
OK, I quickly wrote this up and it works fine (you will need transformers==4.34.0, then build ctransformers from #155 and install it).
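Spelled out, those steps look roughly like this (fetching the PR via pull/&lt;id&gt;/head is standard GitHub practice; the exact install command for the sdist is an assumption):
```bash
pip install transformers==4.34.0
git clone https://github.com/marella/ctransformers
cd ctransformers
git fetch origin pull/155/head:pr-155 && git checkout pr-155
python setup.py sdist
pip install dist/ctransformers-*.tar.gz
```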
Hi @victorlee0505. I've rebuilt with PR https://github.com/marella/ctransformers/pull/155 and can confirm the NotImplementedError is gone. Thanks!

Make sure to run export CT_CUBLAS=ON before python setup.py sdist, otherwise it won't build the CUDA support. You might also need to set these two up in your .bashrc and confirm the nvcc version matches nvidia-smi.
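The thread doesn't show which two variables are meant; a typical CUDA setup in ~/.bashrc exports the toolkit paths (an assumption on my part), plus the build flag from above:
```bash
# Hypothetical .bashrc entries; adjust /usr/local/cuda to your install.
export PATH=/usr/local/cuda/bin:$PATH              # so nvcc is on PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

export CT_CUBLAS=ON        # must be set before `python setup.py sdist`
nvcc --version             # should match the CUDA version nvidia-smi reports
```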
So I get 15x faster token output by having no GPU layers… I think something is wrong.