unsloth: Error name 'quantization' is not defined in unsloth_save_pretrained_gguf

Hi,

I get an error when trying to save a pretrained model to GGUF with q4 quantization, following the GGUF conversion step of your example notebook (https://colab.research.google.com/drive/1Dyauq4kTZoLewQ1cApceUQVNcnnNTzg_?usp=sharing#scrollTo=FqfebeAdT073), which I'm running locally rather than on Colab. The error is as follows:

NameError                                 Traceback (most recent call last)
Cell In[21], line 11
      8 if False: model.push_to_hub_gguf("hf/model", tokenizer, quantization_method = "f16", token = "")
     10 # Save to q4_k_m GGUF
---> 11 if True: model.save_pretrained_gguf("model", tokenizer, quantization_method = "q4_k_m")
     12 if False: model.push_to_hub_gguf("hf/model", tokenizer, quantization_method = "q4_k_m", token = "")

File ~/.conda/envs/genaiplayground/lib/python3.9/site-packages/unsloth/save.py:665, in unsloth_save_pretrained_gguf(self, save_directory, tokenizer, quantization_method, push_to_hub, token, is_main_process, state_dict, save_function, max_shard_size, safe_serialization, variant, save_peft_format, tags, temporary_location, maximum_memory_usage)
    662 for _ in range(3):
    663     gc.collect()
--> 665 file_location = save_to_gguf(new_save_directory, quantization, makefile)
    667 # And save to HF
    668 if push_to_hub:

NameError: name 'quantization' is not defined

Please let me know if you need more info, thanks
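For what it's worth, the traceback points to a plain name mismatch inside `unsloth_save_pretrained_gguf`: the function takes a `quantization_method` parameter, but the failing call at save.py:665 references an undefined name `quantization`. A minimal sketch of that bug class (the function bodies below are illustrative stand-ins, not unsloth's actual code):

```python
def save_to_gguf(save_directory, quantization_method):
    # Stub standing in for unsloth's real GGUF converter.
    return f"{save_directory}-unsloth.{quantization_method.upper()}.gguf"

def save_pretrained_gguf_buggy(save_directory, quantization_method="q4_k_m"):
    # The parameter is named 'quantization_method', but the inner call
    # references 'quantization', which is never defined -> NameError.
    return save_to_gguf(save_directory, quantization)

def save_pretrained_gguf_fixed(save_directory, quantization_method="q4_k_m"):
    # Fix: pass the parameter under its actual name.
    return save_to_gguf(save_directory, quantization_method)
```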

About this issue

  • State: closed
  • Created 5 months ago
  • Comments: 27 (16 by maintainers)

Most upvoted comments

@corticalstack OHHH ok ok!! Interesting - on that note, do you know of any other conversions? Some suggested AWQ and GPTQ 😃

AWQ, GPTQ, and EXL2 for GPU inference

OH YAY!!! It works!!! πŸ˜ƒ))

Boom!

llama_model_quantize_internal: model size  =  2098.35 MB
llama_model_quantize_internal: quant size  =   636.18 MB

main: quantize time = 13103.27 ms
main:    total time = 13103.27 ms
Unsloth: Conversion completed! Output location: ./unsloth_outputs-unsloth.Q4_K_M.gguf
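From those log lines, q4_k_m shrinks this model by roughly 3.3x. A quick check of the arithmetic, with the sizes taken from the output above:

```python
model_mb = 2098.35   # llama_model_quantize_internal: model size
quant_mb = 636.18    # llama_model_quantize_internal: quant size

ratio = model_mb / quant_mb
print(f"q4_k_m compression: {ratio:.2f}x")  # about 3.30x
```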

really amazing work, TY so much @danielhanchen

Also, if you did merge_and_unload, that is probably why it's happening as well, due to the old transformers version.

Likewise, after reinstalling transformers to the latest, reload it in your interactive session.
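The reload advice matters because Python caches imports, so a running session keeps the old module even after a reinstall. One way to confirm which build a fresh session will pick up, sketched with the standard library only:

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package: str):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

# After upgrading, restart the notebook kernel, then check what got installed:
print(installed_version("unsloth"))
print(installed_version("transformers"))
```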

@tmceld Oh, that seems like your Hugging Face version is old - did you do merged_4bit? You can try upgrading transformers to the latest if your goal is to upload a 4bit model: pip install --upgrade git+https://github.com/huggingface/transformers.git

But I suggest using merged_16bit for old transformers versions - actually I might patch save_pretrained to error out with a better error message.
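That advice could be sketched as a small version guard: fall back to merged_16bit on older transformers releases. Note the 4.38 cutoff below is an assumption for illustration, not a documented requirement:

```python
def pick_save_method(transformers_version: str) -> str:
    # Parse the leading major.minor from a version string like "4.36.2".
    major, minor = (int(p) for p in transformers_version.split(".")[:2])
    # Newer transformers can handle merged 4-bit uploads; older releases
    # should fall back to merged_16bit, as suggested above.
    if (major, minor) >= (4, 38):
        return "merged_4bit"
    return "merged_16bit"
```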

@tmceld Oh my - how about this:

pip install --upgrade --force-reinstall --no-cache-dir git+https://github.com/unslothai/unsloth.git

@corticalstack Many apologies - my fault - I fixed this quickly as part of a hotfix. If you can update Unsloth only (no dependency updates), that would be awesome:

pip install --upgrade git+https://github.com/unslothai/unsloth.git

Sorry again!