spaCy: Spacy-LLM fails with storage not allocated on MPS device

Hi,

The code example listed below fails with the following error:

RuntimeError: Placeholder storage has not been allocated on MPS device!

I’m running it on a MacBook Air with Apple Silicon (M2, 2022) under macOS Monterey (Version 12.6). Additional details below.

The full traceback is listed below the code.

Note: This is a continuation of Issue #12987 (Unknown function registry: 'llm_backends').

How to reproduce the behaviour

Here is the code, based on the example provided in Matthew Honnibal’s blog “Against LLM maximalism” (https://explosion.ai/blog/against-llm-maximalism):

import spacy

nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")
nlp.add_pipe(
    "llm",
    config={
        "task": {
            "@llm_tasks": "spacy.NER.v1",
            "labels": "SAAS_PLATFORM,PROGRAMMING_LANGUAGE,OPEN_SOURCE_LIBRARY"
        },
        "model": {
            "@llm_models": "spacy.OpenLLaMA.v1",
            "name": "open_llama_3b"
        },
    },
)

doc = nlp("There's no PyTorch bindings for Go. We just use Microsoft Cognitive Services.")
for ent in doc.ents:
    print(ent.text, ent.label_, ent.sent)

Here is the traceback:

  File "/Users/ron.katriel/PycharmProjects/Transformer/test-spacy-llm.py", line 19, in <module>
    doc = nlp("There's no PyTorch bindings for Go. We just use Microsoft Cognitive Services.")
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/spacy/language.py", line 1054, in __call__
    error_handler(name, proc, [doc], e)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/spacy/util.py", line 1704, in raise_error
    raise e
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/spacy/language.py", line 1049, in __call__
    doc = proc(doc, **component_cfg.get(name, {}))  # type: ignore[call-arg]
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/spacy_llm/pipeline/llm.py", line 156, in __call__
    docs = self._process_docs([doc])
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/spacy_llm/pipeline/llm.py", line 210, in _process_docs
    responses_iters = tee(self._model(prompts_iters[0]), n_iters)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/spacy_llm/models/hf/openllama.py", line 55, in __call__
    return [
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/spacy_llm/models/hf/openllama.py", line 57, in <listcomp>
    self._model.generate(input_ids=tii, **self._config_run)[
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/transformers/generation/utils.py", line 1606, in generate
    return self.greedy_search(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/transformers/generation/utils.py", line 2454, in greedy_search
    outputs = self(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1038, in forward
    outputs = self.model(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 875, in forward
    inputs_embeds = self.embed_tokens(input_ids)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 162, in forward
    return F.embedding(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/nn/functional.py", line 2210, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Placeholder storage has not been allocated on MPS device!
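For context, this is the error PyTorch raises when an operation receives tensors on different devices: the bottom frames show the embedding lookup getting input ids that are still on the CPU while the model weights were loaded onto MPS. Below is a toy sketch of the mismatch and the usual `.to(model.device)` fix — plain-Python stand-ins written for illustration, not spaCy's or PyTorch's actual classes:

```python
class FakeTensor:
    """Stand-in for a torch tensor that just tracks its device."""
    def __init__(self, data, device="cpu"):
        self.data = data
        self.device = device

    def to(self, device):
        # torch's Tensor.to returns a copy placed on the target device
        return FakeTensor(self.data, device)


class FakeModel:
    """Stand-in for an HF model whose weights were loaded onto MPS."""
    device = "mps"

    def generate(self, input_ids):
        if input_ids.device != self.device:
            raise RuntimeError(
                "Placeholder storage has not been allocated on MPS device!"
            )
        return input_ids


model = FakeModel()
ids = FakeTensor([1, 2, 3])  # tokenizer output defaults to CPU

# model.generate(ids) would raise the error above;
# moving the inputs to the model's device first avoids it:
out = model.generate(ids.to(model.device))
print(out.device)  # mps
```

In the real stack this move has to happen inside spacy-llm's HF wrapper (it owns the tokenization), which is why the fix belongs in the library rather than in user code.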

Your Environment

  • Platform: macOS-12.6-arm64-arm-64bit
  • Python Version: 3.11.4
  • spaCy Version: 3.6.1

About this issue

  • Original URL
  • State: closed
  • Created 8 months ago
  • Comments: 18 (10 by maintainers)

Most upvoted comments

Thanks, that did the trick! The only thing left is the following warning, which I have been getting all along:

UserWarning: Couldn't find a CUDA GPU, so the setting 'device_map:auto' will be used, which may result in the LLM being loaded (partly) on the CPU or even the hard disk, which may be slow. Install cuda to be able to load and run the LLM on the GPU instead.

However, no entities were found by the code example listed above, which is unexpected.
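On the empty `doc.ents`: small base models often need more guidance than bare label names. One thing that may be worth trying — an assumption on my part, not a confirmed fix — is the spacy.NER.v2 task, which accepts a `label_definitions` mapping that spells out what each label means in the prompt. A sketch of such a config (the definition texts are my own illustrative wording):

```python
config = {
    "task": {
        "@llm_tasks": "spacy.NER.v2",
        "labels": "SAAS_PLATFORM,PROGRAMMING_LANGUAGE,OPEN_SOURCE_LIBRARY",
        # Hypothetical definitions; tune the wording to your domain.
        "label_definitions": {
            "SAAS_PLATFORM": "A hosted software-as-a-service product, e.g. Microsoft Cognitive Services.",
            "PROGRAMMING_LANGUAGE": "A programming language, e.g. Go or Python.",
            "OPEN_SOURCE_LIBRARY": "An open-source software library, e.g. PyTorch.",
        },
    },
    "model": {
        "@llm_models": "spacy.OpenLLaMA.v1",
        "name": "open_llama_3b",
    },
}
```

This dict would replace the `config` argument in the `nlp.add_pipe("llm", ...)` call from the repro above; whether a 3B model then finds these entities is another question.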