accelerate: ValueError: weight is on the meta device, we need a `value` to put in on cpu.

System Info

Windows 10
Accelerate Version: from git (recent)
Python 3.8.0
4GB GPU
16GB RAM

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
  • My own task or dataset (give details below)

Reproduction

I am using:

BASE_MODEL = "decapoda-research/llama-7b-hf"
LORA_WEIGHTS = "tloen/alpaca-lora-7b"

I get this error: ValueError: weight is on the meta device, we need a `value` to put in on cpu. It is raised in modeling.py, in the function set_module_tensor_to_device:

    if old_value.device == torch.device("meta") and device not in ["meta", torch.device("meta")] and value is None:
        raise ValueError(f"{tensor_name} is on the meta device, we need a `value` to put in on {device}.")
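For context, here is a simplified, self-contained sketch of that guard (not the real accelerate source; the function signature and return value are illustrative). It shows why the error fires: a parameter still on the "meta" device has no data behind it, so moving it to a real device requires an explicit replacement value.

```python
# Simplified sketch (not the actual accelerate code) of the guard that
# raises this ValueError: a tensor on the "meta" device holds no data,
# so moving it to a real device needs an explicit `value`.
def set_module_tensor_to_device(tensor_name, old_device, target_device, value=None):
    if old_device == "meta" and target_device != "meta" and value is None:
        raise ValueError(
            f"{tensor_name} is on the meta device, we need a `value` "
            f"to put in on {target_device}."
        )
    # In the real function the tensor would be copied to target_device;
    # here we just return the supplied value to keep the sketch runnable.
    return value

# Supplying a concrete value lets the move succeed; omitting it
# reproduces the error from the traceback above.
result = set_module_tensor_to_device("model.weight", "meta", "cpu", value=[0.0, 0.1])
```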

More details:

  • I am trying to load the model on my 4GB GPU, so GPU memory is very tight and I assume a lot of offloading happens back and forth between the CPU and the GPU.
  • My code calls model.half(), but that gives me RuntimeError: "addmm_impl_cpu_" not implemented for 'Half', so I disabled that call in the main script.
  • With half() disabled the code gets further, but then fails with the meta-device error above.
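When GPU memory is this tight, accelerate lets you control where each part of the model lives via a device_map. Below is a minimal sketch of a hand-written map; the module names are illustrative only, not the actual LLaMA layer names, and in practice device_map="auto" computes a similar map from available memory.

```python
# Hypothetical device_map: keep the first layers on the 4GB GPU
# (device index 0) and offload the rest to CPU RAM.
# Module names here are illustrative, not real LLaMA layer names.
device_map = {
    "model.embed_tokens": 0,   # GPU 0
    "model.layers.0": 0,
    "model.layers.1": 0,
    "model.layers.2": "cpu",   # offloaded to CPU RAM
    "model.layers.3": "cpu",
    "lm_head": "cpu",
}

# Every module must be assigned to a valid device; quick sanity check:
assert all(v in (0, "cpu", "disk") for v in device_map.values())
```

Such a map can be passed to from_pretrained(..., device_map=device_map); accelerate then dispatches each module to its assigned device and offloads the rest.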

The source code is available: here

Might be related: https://github.com/huggingface/accelerate/issues/1197

Expected behavior

No error. The model and the weights are loaded (in both CPU and GPU).

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 22

Most upvoted comments

I forgot to mention: I installed accelerate yesterday from Git. Everything works fine in Google Colab, which has a 16 GB GPU, and the model loads there without issues. I use the weights from Stanford Alpaca, not from Meta. On my laptop with a 4GB GPU it fails whenever I insist on using the GPU. In CPU-only mode it also works on the laptop, but an answer to a prompt takes between 20 and 40 minutes. So when I force the 4GB GPU, it fails somewhere while shuttling the model back and forth between GPU and CPU memory. I do not understand the error message very well. What is a "meta" device, for example?
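To answer that question: PyTorch's "meta" device holds tensors that record only shape and dtype, with no data allocated, which is how accelerate builds a huge model skeleton instantly under init_empty_weights before streaming in the real weights. A rough plain-Python analogy (this is not the torch API, just an illustration of the idea):

```python
# Rough analogy in plain Python (NOT the torch API): a "meta" tensor
# knows its shape but allocates no storage, so it cannot be moved to a
# real device until it is given concrete values.
class MetaTensor:
    def __init__(self, shape):
        self.shape = shape
        self.data = None  # no storage allocated yet

    def materialize(self, values):
        # Supplying real values plays the role of the `value` argument
        # in accelerate's set_module_tensor_to_device.
        if len(values) != self.shape[0]:
            raise ValueError("shape mismatch")
        self.data = list(values)
        return self

t = MetaTensor((3,))           # shape is known, data is not
t.materialize([0.1, 0.2, 0.3]) # now it holds real values
```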

@philip30 This happens because initialization under init_empty_weights breaks the tied weights. You need to call model.tie_weights() afterward to re-tie them:

from huggingface_hub import hf_hub_download
from transformers import AutoConfig, AutoModelWithLMHead
from accelerate import init_empty_weights, load_checkpoint_and_dispatch

whisper_model = "openai/whisper-tiny"
weights_location = hf_hub_download(whisper_model, "pytorch_model.bin")
config = AutoConfig.from_pretrained(whisper_model)
with init_empty_weights():
    model = AutoModelWithLMHead.from_config(config)
model.tie_weights()  # re-tie the weights broken by init_empty_weights
model = load_checkpoint_and_dispatch(model, weights_location, device_map="auto")

This is in the documentation

If you are not offloading anything (e.g. the device map only contains GPUs), it works for training as well.