spaCy: ValueError: shapes (4,0) and (300,128) not aligned from numpy at Thinc
I need to handle German and English in a single application. This worked fine with spaCy 1.8.2, 1.9.0 and 1.10.0, but it breaks with spaCy 2.0.3.
To reproduce the issue:
>>> import spacy
>>> nlpEN = spacy.load('en')
>>> nlpDE = spacy.load('de')
>>> doc = nlpEN('Hello world!')
The error message:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Projects/foobar/.env/lib/python3.6/site-packages/spacy/language.py", line 333, in __call__
doc = proc(doc)
File "pipeline.pyx", line 390, in spacy.pipeline.Tagger.__call__
File "pipeline.pyx", line 402, in spacy.pipeline.Tagger.predict
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/neural/_classes/model.py", line 161, in __call__
return self.predict(x)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 55, in predict
X = layer(X)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/neural/_classes/model.py", line 161, in __call__
return self.predict(x)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 293, in predict
X = layer(layer.ops.flatten(seqs_in, pad=pad))
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/neural/_classes/model.py", line 161, in __call__
return self.predict(x)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 55, in predict
X = layer(X)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/neural/_classes/model.py", line 161, in __call__
return self.predict(x)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/neural/_classes/model.py", line 125, in predict
y, _ = self.begin_update(X)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 372, in uniqued_fwd
Y_uniq, bp_Y_uniq = layer.begin_update(X[ind], drop=drop)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 61, in begin_update
X, inc_layer_grad = layer.begin_update(X, drop=drop)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in begin_update
values = [fwd(X, *a, **k) for fwd in forward]
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in <listcomp>
values = [fwd(X, *a, **k) for fwd in forward]
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 258, in wrap
output = func(*args, **kwargs)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in begin_update
values = [fwd(X, *a, **k) for fwd in forward]
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in <listcomp>
values = [fwd(X, *a, **k) for fwd in forward]
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 258, in wrap
output = func(*args, **kwargs)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in begin_update
values = [fwd(X, *a, **k) for fwd in forward]
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in <listcomp>
values = [fwd(X, *a, **k) for fwd in forward]
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 258, in wrap
output = func(*args, **kwargs)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in begin_update
values = [fwd(X, *a, **k) for fwd in forward]
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 176, in <listcomp>
values = [fwd(X, *a, **k) for fwd in forward]
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/api.py", line 258, in wrap
output = func(*args, **kwargs)
File "/Projects/foobar/.env/lib/python3.6/site-packages/thinc/neural/_classes/static_vectors.py", line 67, in begin_update
dotted = self.ops.batch_dot(vectors, self.W)
File "ops.pyx", line 299, in thinc.neural.ops.NumpyOps.batch_dot
ValueError: shapes (4,0) and (300,128) not aligned: 0 (dim 1) != 300 (dim 0)
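The last line of the traceback is an ordinary NumPy shape mismatch, and it can be reproduced without spaCy or Thinc. The sketch below is only an illustration: the array shapes are copied from the error message, while the names `vectors`, `W` and `try_dot` are made up for this example and the zero-filled arrays stand in for the real data.

```python
import numpy as np

# Shapes taken from the error message: 4 rows of zero-width vectors,
# and a weight matrix that expects 300-dimensional input.
vectors = np.zeros((4, 0), dtype="float32")
W = np.zeros((300, 128), dtype="float32")


def try_dot(a, b):
    """Return (result, None) on success or (None, error text) on failure."""
    try:
        return a.dot(b), None
    except ValueError as err:
        return None, str(err)


# Inner dimensions 0 and 300 do not match, so the product is undefined.
result, message = try_dot(vectors, W)
print(message)

# With 300-wide rows the same weights work and give a (4, 128) output.
ok_result, ok_message = try_dot(np.zeros((4, 300), dtype="float32"), W)
print(ok_result.shape)
```

In other words, the tagger's weights expect 300-dimensional vectors, but the vectors it actually received had width 0.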
Info about spaCy
- spaCy version: 2.0.3
- Platform: Darwin-17.2.0-x86_64-i386-64bit
- Python version: 3.6.3
- Models: en_core_web_md, fr_core_news_md, it, it_core_news_sm, de_core_news_sm, de, fr, en
- en_core_web_md version: 2.0.0
- de_core_news_sm version: 2.0.0
About this issue
- State: closed
- Created 7 years ago
- Comments: 15 (4 by maintainers)
Commits related to this issue
- Pre-annotate test German sentence due to explosion/spaCy#1660 — committed to danielhers/tupa by danielhers 6 years ago
- Fix loading of multiple pre-trained vectors. This patch addresses #1660, which was caused by keying all pre-trained vectors with the same ID when telling Thinc how to refer to them. This meant that if... — committed to explosion/spaCy by honnibal 6 years ago
- Add test for #1660 — committed to explosion/spaCy by honnibal 6 years ago
I think this comes down to an ill-considered use of a global variable when using pre-trained models in Thinc. The global variable is there to avoid storing extra copies of the vectors data. However, I think it is not keyed correctly by spaCy model, which causes this error when multiple language models are in memory.
I expect to get to this bug before the end of the week. Thanks for your patience, and thanks for reporting.
Has anybody found even a temporary fix for this?
This does not happen only with pre-trained models; it is happening for me with all custom models.