spaCy: Error when GPU initialization and model loading are in different notebook cells

How to reproduce the behaviour

I installed the latest version following the installation instructions:

pip install -U pip setuptools wheel
pip install -U spacy[cuda110]
python -m spacy download en_core_web_trf

The installation was successful, and the GPU is available:

import spacy
from thinc.api import set_gpu_allocator, require_gpu
import spacy_transformers
print(spacy.prefer_gpu(),require_gpu(0))
True True

However, using “en_core_web_trf” raises the error “a bytes-like object is required, not ‘cupy.core.core.ndarray’”:

nlp = spacy.load("en_core_web_trf")
sent = 'Hello World'
doc = nlp(sent)
TypeError                                 Traceback (most recent call last)
<ipython-input-12-66e94dc9d1fd> in <module>
      1 sent = 'Hello World'
----> 2 doc = nlp(sent)


~/anaconda3/envs/acl/lib/python3.8/site-packages/spacy/language.py in __call__(self, text, disable, component_cfg)
    992                 raise ValueError(Errors.E109.format(name=name)) from e
    993             except Exception as e:
--> 994                 error_handler(name, proc, [doc], e)
    995             if doc is None:
    996                 raise ValueError(Errors.E005.format(name=name))

~/anaconda3/envs/acl/lib/python3.8/site-packages/spacy/util.py in raise_error(proc_name, proc, docs, e)
   1493 
   1494 def raise_error(proc_name, proc, docs, e):
-> 1495     raise e
   1496 
   1497 

~/anaconda3/envs/acl/lib/python3.8/site-packages/spacy/language.py in __call__(self, text, disable, component_cfg)
    987                 error_handler = proc.get_error_handler()
    988             try:
--> 989                 doc = proc(doc, **component_cfg.get(name, {}))
    990             except KeyError as e:
    991                 # This typically happens if a component is not initialized

~/anaconda3/envs/acl/lib/python3.8/site-packages/spacy/pipeline/trainable_pipe.pyx in spacy.pipeline.trainable_pipe.TrainablePipe.__call__()

~/anaconda3/envs/acl/lib/python3.8/site-packages/spacy/util.py in raise_error(proc_name, proc, docs, e)
   1493 
   1494 def raise_error(proc_name, proc, docs, e):
-> 1495     raise e
   1496 
   1497 

~/anaconda3/envs/acl/lib/python3.8/site-packages/spacy/pipeline/trainable_pipe.pyx in spacy.pipeline.trainable_pipe.TrainablePipe.__call__()

~/anaconda3/envs/acl/lib/python3.8/site-packages/spacy/pipeline/tagger.pyx in spacy.pipeline.tagger.Tagger.predict()

~/anaconda3/envs/acl/lib/python3.8/site-packages/thinc/model.py in predict(self, X)
    310         only the output, instead of the `(output, callback)` tuple.
    311         """
--> 312         return self._func(self, X, is_train=False)[0]
    313 
    314     def finish_update(self, optimizer: Optimizer) -> None:

~/anaconda3/envs/acl/lib/python3.8/site-packages/thinc/layers/chain.py in forward(model, X, is_train)
     52     callbacks = []
     53     for layer in model.layers:
---> 54         Y, inc_layer_grad = layer(X, is_train=is_train)
     55         callbacks.append(inc_layer_grad)
     56         X = Y

~/anaconda3/envs/acl/lib/python3.8/site-packages/thinc/model.py in __call__(self, X, is_train)
    286         """Call the model's `forward` function, returning the output and a
    287         callback to compute the gradients via backpropagation."""
--> 288         return self._func(self, X, is_train=is_train)
    289 
    290     def initialize(self, X: Optional[InT] = None, Y: Optional[OutT] = None) -> "Model":

~/anaconda3/envs/acl/lib/python3.8/site-packages/thinc/layers/chain.py in forward(model, X, is_train)
     52     callbacks = []
     53     for layer in model.layers:
---> 54         Y, inc_layer_grad = layer(X, is_train=is_train)
     55         callbacks.append(inc_layer_grad)
     56         X = Y

~/anaconda3/envs/acl/lib/python3.8/site-packages/thinc/model.py in __call__(self, X, is_train)
    286         """Call the model's `forward` function, returning the output and a
    287         callback to compute the gradients via backpropagation."""
--> 288         return self._func(self, X, is_train=is_train)
    289 
    290     def initialize(self, X: Optional[InT] = None, Y: Optional[OutT] = None) -> "Model":

~/anaconda3/envs/acl/lib/python3.8/site-packages/spacy_transformers/layers/trfs2arrays.py in forward(model, trf_datas, is_train)
     26             src = model.ops.reshape2f(trf_data.tensors[t_i], -1, trf_data.width)
     27             dst, get_d_src = apply_alignment(model.ops, trf_data.align, src)
---> 28             output, get_d_dst = pooling(dst, is_train)
     29             outputs.append(output)
     30             backprops.append((get_d_dst, get_d_src))

~/anaconda3/envs/acl/lib/python3.8/site-packages/thinc/model.py in __call__(self, X, is_train)
    286         """Call the model's `forward` function, returning the output and a
    287         callback to compute the gradients via backpropagation."""
--> 288         return self._func(self, X, is_train=is_train)
    289 
    290     def initialize(self, X: Optional[InT] = None, Y: Optional[OutT] = None) -> "Model":

~/anaconda3/envs/acl/lib/python3.8/site-packages/thinc/layers/reduce_mean.py in forward(model, Xr, is_train)
     16 
     17 def forward(model: Model[InT, OutT], Xr: InT, is_train: bool) -> Tuple[OutT, Callable]:
---> 18     Y = model.ops.reduce_mean(cast(Floats2d, Xr.data), Xr.lengths)
     19     lengths = Xr.lengths
     20 

~/anaconda3/envs/acl/lib/python3.8/site-packages/thinc/backends/numpy_ops.pyx in thinc.backends.numpy_ops.NumpyOps.reduce_mean()

~/anaconda3/envs/acl/lib/python3.8/site-packages/thinc/backends/numpy_ops.cpython-38-x86_64-linux-gnu.so in View.MemoryView.memoryview_cwrapper()

~/anaconda3/envs/acl/lib/python3.8/site-packages/thinc/backends/numpy_ops.cpython-38-x86_64-linux-gnu.so in View.MemoryView.memoryview.__cinit__()

TypeError: a bytes-like object is required, not 'cupy.core.core.ndarray'

When I try en_core_web_sm (3.0.0) and en_core_web_lg (3.0.0), there is no problem at all.

Your Environment

Info about spaCy

  • spaCy version: 3.0.1
  • Platform: Linux-4.19.128-microsoft-standard-x86_64-with-glibc2.10
  • Python version: 3.8.5
  • Pipelines: en_core_web_trf (3.0.0), en_core_web_sm (3.0.0), en_core_web_lg (3.0.0)
  • Operating System: Win10, WSL2
  • Python Version Used: Python 3.8.5
  • spaCy Version Used: 3.0.1
  • Environment Information: Virtual Environment

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 38 (8 by maintainers)

Most upvoted comments

@adrianeboyd I did some experiments in a Jupyter notebook that might help you narrow down and replicate the issue. The error is raised only when the following two lines are NOT in the same cell of the notebook.

  • spacy.require_gpu() / spacy.prefer_gpu()
  • nlp = spacy.load("en_core_web_trf")

I don’t know how, but you are correct. Moving spacy.load into the same cell works.

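import spacy
import torch

# Run GPU activation and model loading in the same cell.
is_using_gpu = spacy.prefer_gpu()
if is_using_gpu:
    # Make PyTorch allocate new tensors on the GPU as well.
    torch.set_default_tensor_type("torch.cuda.FloatTensor")

nlp = spacy.load("en_core_web_trf")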
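
One way to make sure the two calls cannot drift into separate cells is to wrap them in a single function. This is only a minimal sketch: load_on_gpu is a hypothetical helper name, not part of spaCy, and it assumes spacy.require_gpu and spacy.load behave as described in this thread.

```python
def load_on_gpu(model_name, gpu_id=0):
    """Activate the GPU and load a pipeline in one call.

    Hypothetical helper (not part of spaCy): keeping both steps in one
    function means they always run in the same notebook cell.
    """
    import spacy  # local import keeps the helper self-contained

    # Raises if no GPU is available instead of silently staying on CPU.
    spacy.require_gpu(gpu_id)
    return spacy.load(model_name)
```

Calling nlp = load_on_gpu("en_core_web_trf") in whichever cell needs the pipeline then performs both steps together.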

Even with the non-pytorch models, having require_gpu in one cell and spacy.load in another can result in the model being loaded on CPU instead of GPU, so it’s not a pytorch-only issue. The plain spacy model loaded on CPU doesn’t crash, but it’s still not what the user intended.

The pytorch GPU setting is stored differently and preserved across cells, so you end up loading the spacy parts of the model on CPU and the torch parts on GPU and it doesn’t work together, which leads to the error in the original report.
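
To see which side of that mismatch a given cell is on, one option is to ask Thinc which ops backend is currently active. The sketch below uses get_current_ops from thinc.api; check_gpu_backend is a hypothetical helper, and it assumes the CPU backend is named NumpyOps (as in the traceback above) and the GPU backend CupyOps.

```python
def current_backend():
    """Return the class name of the Thinc ops active in this context."""
    from thinc.api import get_current_ops  # local import keeps the helper self-contained

    return type(get_current_ops()).__name__


def check_gpu_backend():
    """Hypothetical check: fail early if Thinc is still on the CPU backend."""
    name = current_backend()
    # Assumption: "NumpyOps" is the CPU backend, as seen in the traceback.
    if name == "NumpyOps":
        raise RuntimeError(
            "Thinc is using NumpyOps (CPU) in this cell; call "
            "spacy.require_gpu() here before spacy.load()."
        )
    return name
```

Running check_gpu_backend() right before spacy.load in the same cell should surface the problem as a clear message rather than as the TypeError deep inside the pipeline.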

Can confirm as well that @wayne315315’s suggestion works.

I was facing this issue even though my code was in a .py file, inside a class whose __init__() function had spacy.require_gpu(0) as its first line.

Fine-tuning worked like a charm, but prediction threw me this error!

So, taking the hint about running spacy.require_gpu(0) within the same Jupyter cell, I explicitly added this line to the predict function where I load the saved models, and it started working fine!

Taadaaaaa!!!

Thanks to @svlandeg and everyone else for sharing clues around this!

I think this is the underlying reason: https://github.com/ipython/ipython/issues/11565

as Thinc stores the correct “Ops” in a ContextVar.
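
The effect of keeping a setting in a ContextVar can be illustrated with the standard library alone. This is a simulation under stated assumptions, not Thinc or IPython code: current_ops, require_gpu, load_model and run_cell are illustrative stand-ins, and running each "cell" in a fresh Context is a simplification of what a notebook may do.

```python
import contextvars

# Stand-in for the setting Thinc keeps in a ContextVar; the name and the
# string values are illustrative, not Thinc's real objects.
current_ops = contextvars.ContextVar("current_ops", default="NumpyOps")


def require_gpu():
    """Pretend to switch the backend to GPU for the current context."""
    current_ops.set("CupyOps")


def load_model():
    """Pretend to load a model; it picks up whatever backend is current."""
    return current_ops.get()


def run_cell(*steps):
    """Run steps in a fresh, empty context, as if each cell had its own."""
    ctx = contextvars.Context()
    result = None
    for step in steps:
        result = ctx.run(step)
    return result


# Two separate cells: the GPU setting from the first is not seen by the second.
run_cell(require_gpu)
split_cells = run_cell(load_model)

# One cell: the setting is still in effect when the model is loaded.
same_cell = run_cell(require_gpu, load_model)

print(split_cells, same_cell)  # NumpyOps CupyOps
```

In the split case the "model" falls back to the default CPU backend, which mirrors the reports above where spacy.load in a later cell ended up on CPU.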

@bitmman I can confirm the same. With spacy[cuda100]==3.0.1 and en_core_web_trf=3.0.0, I got the same error from the ‘en_core_web_trf’ model in a Jupyter notebook, but identical code ran smoothly in the terminal.

I can’t test any of this locally, so this is a bit tedious, sorry. CUDA 11.0 support shouldn’t be the issue. If en_core_web_lg is working fine with GPU, then spacy and cupy are probably fine. Someone else is reporting the same error in #6971 with a conda install of torch (the next thing I was going to suggest), so I am not sure what’s going on. I can’t replicate this with a very similar conda setup on Linux, so it’s hard to debug.

What is the output of conda list?

Is it possible for you to try this in a pip-only venv?

Hi, my previous code was running in a Jupyter notebook. I just ran exactly the same code in the terminal, in both a conda venv and a pip-only venv, and there were no problems at all. So I guess this problem only occurs when running “en_core_web_trf” in a Jupyter notebook?

Thanks.