spaCy: ValueError: shapes (4,0) and (300,128) not aligned from numpy at Thinc

I need to handle both German and English in a single application. This worked fine with spaCy 1.8.2, 1.9.0 and 1.10.0, but it breaks with spaCy 2.0.3.

To reproduce the issue:

 >>> import spacy
 >>> nlpEN = spacy.load('en')
 >>> nlpDE = spacy.load('de')
 >>> doc = nlpEN('Hello world!')

The error message:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Projects/foobar/.env/lib/python3.6/site-packages/spacy/language.py", line 333, in __call__
    doc = proc(doc)
  File "pipeline.pyx", line 390, in spacy.pipeline.Tagger.__call__
  File "pipeline.pyx", line 402, in spacy.pipeline.Tagger.predict
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/neural/_classes/model.py", line 161, in __call__
    return self.predict(x)
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 55, in predict
    X = layer(X)
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/neural/_classes/model.py", line 161, in __call__
    return self.predict(x)
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 293, in predict
    X = layer(layer.ops.flatten(seqs_in, pad=pad))
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/neural/_classes/model.py", line 161, in __call__
    return self.predict(x)
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 55, in predict
    X = layer(X)
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/neural/_classes/model.py", line 161, in __call__
    return self.predict(x)
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/neural/_classes/model.py", line 125, in predict
    y, _ = self.begin_update(X)
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 372, in uniqued_fwd
    Y_uniq, bp_Y_uniq = layer.begin_update(X[ind], drop=drop)
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 61, in begin_update
    X, inc_layer_grad = layer.begin_update(X, drop=drop)
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in begin_update
    values = [fwd(X, *a, **k) for fwd in forward]
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in <listcomp>
    values = [fwd(X, *a, **k) for fwd in forward]
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 258, in wrap
    output = func(*args, **kwargs)
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in begin_update
    values = [fwd(X, *a, **k) for fwd in forward]
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in <listcomp>
    values = [fwd(X, *a, **k) for fwd in forward]
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 258, in wrap
    output = func(*args, **kwargs)
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in begin_update
    values = [fwd(X, *a, **k) for fwd in forward]
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in <listcomp>
    values = [fwd(X, *a, **k) for fwd in forward]
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 258, in wrap
    output = func(*args, **kwargs)
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in begin_update
    values = [fwd(X, *a, **k) for fwd in forward]
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in <listcomp>
    values = [fwd(X, *a, **k) for fwd in forward]
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 258, in wrap
    output = func(*args, **kwargs)
  File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/neural/_classes/static_vectors.py", line 67, in begin_update
    dotted = self.ops.batch_dot(vectors, self.W)
  File "ops.pyx", line 299, in thinc.neural.ops.NumpyOps.batch_dot
ValueError: shapes (4,0) and (300,128) not aligned: 0 (dim 1) != 300 (dim 0)
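
For reference, the final line of the traceback is the message NumPy raises when the inner dimensions of a matrix product do not match: here the left operand has 0 columns while the weight matrix expects 300 rows. The snippet below is a minimal sketch that reproduces only that message with plain NumPy arrays. The names `vectors` and `W` mirror the traceback, but the arrays are made up for illustration and nothing here touches spaCy or Thinc.

```python
import numpy as np

# Hypothetical arrays with the same shapes as in the traceback above.
vectors = np.zeros((4, 0), dtype="float32")   # four rows, zero-width vectors
W = np.zeros((300, 128), dtype="float32")     # weights expecting 300-dimensional input

try:
    np.dot(vectors, W)
except ValueError as err:
    # The inner dimensions (0 and 300) differ, so the product is undefined.
    message = str(err)
    print(message)
```

A first operand of width 0 suggests that the vectors table reaching this layer is empty, rather than the 300-dimensional table the layer's weights were built for.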

Info about spaCy

  • spaCy version: 2.0.3
  • Platform: Darwin-17.2.0-x86_64-i386-64bit
  • Python version: 3.6.3
  • Models: en_core_web_md, fr_core_news_md, it, it_core_news_sm, de_core_news_sm, de, fr, en
  • en_core_web_md version: 2.0.0
  • de_core_news_sm version: 2.0.0

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 15 (4 by maintainers)

Most upvoted comments

I think this comes down to an ill-considered use of a global variable when using pre-trained models in Thinc. The global variable is used to avoid storing extra copies of the vectors data. However, I think it's not keyed correctly by the spaCy model, which causes this error when multiple language models are in memory.

I expect to get to this bug before the end of the week – thanks for your patience; and thanks for reporting.
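
To illustrate the kind of problem described above, here is a toy sketch of a module-level cache that is not keyed per model. It is not Thinc's actual code: `_CACHE`, the loader and getter functions, and the two arrays are hypothetical, and the zero-width German table is an assumption chosen to match the `(4,0)` shape in the traceback.

```python
import numpy as np

# Hypothetical stand-ins for two models' vector tables (not real spaCy data).
EN_VECTORS = np.ones((10, 300), dtype="float32")  # assumed: model with 300-d vectors
DE_VECTORS = np.zeros((10, 0), dtype="float32")   # assumed: model without vectors

_CACHE = {}  # module-level global shared by every model in the process


def load_vectors_shared_key(data):
    """Problematic pattern: one fixed key, so the most recent load wins."""
    _CACHE["vectors"] = data


def get_vectors_shared_key():
    return _CACHE["vectors"]


def load_vectors_per_model(model_name, data):
    """Safer pattern: each model stores its table under its own key."""
    _CACHE[model_name] = data


def get_vectors_per_model(model_name):
    return _CACHE[model_name]


W_en = np.zeros((300, 128), dtype="float32")  # English layer expects 300-d input

# Loading German after English overwrites the shared entry.
load_vectors_shared_key(EN_VECTORS)
load_vectors_shared_key(DE_VECTORS)
try:
    np.dot(get_vectors_shared_key()[:4], W_en)
    shared_key_failed = False
except ValueError:
    shared_key_failed = True

# With per-model keys, the English layer still sees the English table.
load_vectors_per_model("en", EN_VECTORS)
load_vectors_per_model("de", DE_VECTORS)
out = np.dot(get_vectors_per_model("en")[:4], W_en)
print(shared_key_failed, out.shape)
```

In this sketch the shared-key lookup fails as soon as the second model is loaded, which matches the order of the `spacy.load` calls in the report, while the per-model lookup keeps the two tables separate.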

Has anybody found even a temporary fix to this?

This does not only happen with pre-trained models; it is happening for me with all custom models as well.