Nuitka: Tokenizer requires the SentencePiece library but it was not found in your environment. Checkout the instructions on the ...
When using nuitka 1.7.9 with transformers==4.29.2, this works perfectly fine: it finds the SentencePiece library and the tokenizer, and it tokenizes.
When using nuitka 1.7.9 with transformers==4.31.0, it fails with:
LlamaTokenizer requires the SentencePiece library but it was not found in your environment. Checkout the instructions on the installation page of its repo: https://github.com/google/sentencepiece#installation and follow the ones that match your environment. Please note that you may need to restart your runtime after installation.
(The output was interleaved with traceback fragments: `_sentencepiece_objects.py", line 93, in __init__` and `t_utils.py", line 1027, in requires_backends`.)
Here is an example:
import logging

from transformers import LlamaTokenizer

logging.basicConfig(
    format="%(asctime)s %(levelname)s [%(name)s] %(message)s",
    level=logging.INFO,
    datefmt="%Y-%m-%d %H:%M:%S",
)

# Local paths to the SentencePiece model and the tokenizer JSON.
vocab_file = r"C:\Users\Tensor\Desktop\TestSentencePieceHF\Pygmalion-7b\tokenizer.model"
tokenizer_file = r"C:\Users\Tensor\Desktop\TestSentencePieceHF\Pygmalion-7b\tokenizer.json"

# Constructing LlamaTokenizer is where the SentencePiece error is raised in the compiled build.
tokenizer = LlamaTokenizer(vocab_file=vocab_file, tokenizer_file=tokenizer_file, use_fast=True)

examples = [
    tokenizer(
        "auto-gptq is an easy-to-use model quantization library with user-friendly apis, based on GPTQ algorithm."
    )
]
print(tokenizer("auto_gptq is", return_tensors="pt"))
These are the model files to be loaded: C:\Users\Tensor\Desktop\TestSentencePiece\Pygmalion-7b-4bit-GPTQ-Safetensors (download: https://drive.google.com/file/d/10ZD89g7BVpjbmnkrzIEirxvNzoLwdxOX/view?usp=sharing)
About this issue
- State: closed
- Created a year ago
- Comments: 16 (10 by maintainers)
So, transformers has a bunch of metadata dependencies we do not yet list, so --include-distribution-metadata can likely solve this, but I am doing two things for this. First, in case of missing metadata, it's going to give an error that suggests the options. Second, I am going to add a bunch of these from the source. It's likely a moving target, but I don't want code to extract these.
I am having this issue as well. Would be thankful for a solution!
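The maintainer's point about distribution metadata can be checked from inside the compiled program. The following is a minimal sketch, assuming the failure comes from the package being importable while its distribution metadata is missing from the build; that reading is an inference from the comment above, not something the thread confirms. `check_package` is a hypothetical helper and is not part of Nuitka or transformers.

```python
import importlib.metadata
import importlib.util


def check_package(name):
    """Report whether `name` can be imported and whether its
    distribution metadata (version info) is visible at runtime."""
    # True if the import system can locate the top-level module.
    importable = importlib.util.find_spec(name) is not None
    try:
        # Reads the installed distribution's metadata, not the module itself.
        version = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        version = None
    return {"importable": importable, "metadata_version": version}


if __name__ == "__main__":
    # Run this both as a plain script and as a Nuitka-compiled binary
    # and compare the two results.
    print(check_package("sentencepiece"))
```

If the compiled binary reports the package as importable but with no metadata version, that would be consistent with the maintainer's diagnosis, and passing the --include-distribution-metadata option for sentencepiece when compiling may be worth trying. The exact value to pass is an assumption here; check the Nuitka help output for the form your version accepts.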
Absolutely! Just surfacing these things so they don't get lost 😃 Thank you again!