llama.cpp: Failed to convert Llama-v2 models
Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [Y] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- [Y] I carefully followed the README.md.
- [Y] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- [Y] I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
Successful conversion of Llama models using the following command:
python convert.py models/xxx
where xxx is the original trained Llama model downloaded from Facebook.
Current Behavior
Conversion fails with errors (detailed below).
I’ve found that the behavior changed in the following commit; after checking out an older version, the conversion succeeds:
873637afc7924f435ac44c067630a28e82eefa7b
It seems that after the above commit, convert.py no longer supports the BPE vocab format (the --vocabtype parameter has been removed), but the README does not reflect this change, which causes confusion.
Environment and Context
- Physical (or virtual) hardware you are using, e.g. for Linux:
MacBook with Apple M3 Max
- Operating System, e.g. for Linux:
macOS 14.1 (23B2073)
- SDK version, e.g. for Linux:
Python 3.10.13, transformers 4.36.1
Failure Information (for bugs)
Loading model file models/13B/consolidated.00.pth
Loading model file models/13B/consolidated.01.pth
params = Params(n_vocab=32000, n_embd=5120, n_layer=40, n_ctx=4096, n_ff=13824, n_head=40, n_head_kv=40, n_experts=None, n_experts_used=None, f_norm_eps=1e-05, rope_scaling_type=None, f_rope_freq_base=None, f_rope_scale=None, n_orig_ctx=None, rope_finetuned=None, ftype=None, path_model=PosixPath('models/13B'))
Traceback (most recent call last):
File "llama.cpp/convert.py", line 1279, in <module>
main()
File "llama.cpp/convert.py", line 1255, in main
vocab = VocabLoader(params, vocab_dir)
File "llama.cpp/convert.py", line 342, in __init__
self.tokenizer = AutoTokenizer.from_pretrained(str(fname_tokenizer), trust_remote_code=True)
File "python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 752, in from_pretrained
config = AutoConfig.from_pretrained(
File "python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1082, in from_pretrained
config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
File "python3.10/site-packages/transformers/configuration_utils.py", line 644, in get_config_dict
config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
File "python3.10/site-packages/transformers/configuration_utils.py", line 699, in _get_config_dict
resolved_config_file = cached_file(
File "python3.10/site-packages/transformers/utils/hub.py", line 360, in cached_file
raise EnvironmentError(
OSError: models/13B does not appear to have a file named config.json. Checkout 'https://huggingface.co/models/13B/None' for available files.
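For context, the OSError above is raised because transformers finds none of the Hugging Face metadata it expects in the checkpoint directory. Below is a minimal diagnostic sketch, not part of convert.py; the path is the example from the log, and tokenizer.model may actually sit one directory higher in the original Meta download.
from pathlib import Path
# Diagnostic sketch: list which files the checkpoint directory actually contains.
# "models/13B" is the example path from the log above.
model_dir = Path("models/13B")
for name in ("consolidated.00.pth", "params.json", "tokenizer.model", "config.json"):
    print(name, "found" if (model_dir / name).exists() else "missing")
# config.json only exists in the Hugging Face format, so AutoTokenizer.from_pretrained()
# falls back to treating "models/13B" as a hub repo id and raises the OSError shown above.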
As a temporary fix you can simply check out an older commit, such as:
git checkout 0353a1840134b24b07ab61fd4490192f28c4db6b
This is the latest commit before the bug appears.
Between work, holidays, and helping out my cousins, I got sick over the holidays, so that’s why I went MIA. I spent the last few days fighting off the fever and am starting to feel better now. Wondering if any progress was made; if not, I can pick up where I left off. I’m skipping work over the weekend to recoup, so I’ll need something to keep me busy.
I did start on a new project named to_gguf, which was supposed to isolate and reorganize a lot of the client-facing code. It was mostly experimental, so I could play around with it without messing with the upstream source code. Any progress I make there I could push upstream if there is any interest in it; if not, it would be educational for me regardless.
SPM and BPE vocabulary support was removed, so if you’re using a non-Hugging-Face model you’ll get this error. If you use Facebook’s HF model, it will work.
transformers.AutoTokenizer is looking for the vocabulary and fails to find it because it falls back to remote access for a file that doesn’t exist. This is simply due to the way the vocabularies are handled. I’ll be looking into it over the next few days to see if I can fix the regression.
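For anyone debugging this in the meantime: the vocabulary the removed SPM path used to read is still accessible directly via the sentencepiece package. A rough sketch, not convert.py code; it assumes tokenizer.model has been copied next to the checkpoints (in the original Meta download it lives one level up, e.g. models/tokenizer.model).
import sentencepiece as spm
# Read the original SentencePiece vocabulary directly.
# Adjust the path if tokenizer.model is in the parent directory of models/13B.
sp = spm.SentencePieceProcessor()
sp.load("models/13B/tokenizer.model")
# For the original Llama-2 checkpoints this should print 32000,
# matching n_vocab in the params line from convert.py above.
print("n_vocab:", sp.get_piece_size())
print("first pieces:", [sp.id_to_piece(i) for i in range(5)])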
It’s fine as it is, since there is a workaround suggested. The old method created issues with other models where n_vocab cannot be deduced correctly.
The conversion fails with the latest master. Reverting to 0353a1840134b24b07ab61fd4490192f28c4db6b allows convert.py to progress.
The new backtrace with master is:
@teleprint-me Thanks for looking into this. Also pinging @strutive07 for any extra insight on this.
Same here @HighTemplar-wjiang. I found an alternative by reverting to commit f4d973cecb7368c985720ba9100ae6abba14806d and running the README instructions. It did not solve the problem, but it can give us some hints about what happened.
I attach an image below of the quantized model being tested.