spaCy: Importing spacy (or thinc) breaks dot product
A very annoying bug that took me forever to track down … I’m at a loss for what might be going on here.
How to reproduce the behaviour
import numpy as np
# works
small = np.random.randn(5, 6)
small.T @ small
# works
larger = np.random.randn(100, 110)
larger.T @ larger
import spacy # or import thinc
# works
small.T @ small
# works
larger + larger
# hangs forever, max CPU usage
larger.T @ larger # larger.T.dot(larger) also hangs
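Since the failing line never returns, it can be easier to test it in a child process that gets killed after a deadline, so the interactive session stays usable. The following is only a sketch: `finishes_within` and `REPRO` are hypothetical names introduced here, the 30 second default is arbitrary, and the snippet is a condensed copy of the steps above that may not trigger the hang on every machine.

```python
import subprocess
import sys


def finishes_within(snippet, timeout_s=30.0):
    """Run `snippet` in a fresh interpreter.

    Returns True if the child exits with status 0 before the deadline,
    False if it times out (and is killed) or exits with an error.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", snippet],
            timeout=timeout_s,
            capture_output=True,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired
        return False
    return result.returncode == 0


# Condensed version of the reproduction steps from this report.
REPRO = """
import numpy as np
larger = np.random.randn(100, 110)
larger.T @ larger   # before the import: reported to work
import spacy        # or: import thinc
larger.T @ larger   # after the import: reported to hang
"""
```

Calling `finishes_within(REPRO)` would then return `False` on a machine that shows the hang (or one where spaCy is not installed), and `True` where the product completes.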
Your Environment
- spaCy version: 3.5.3 (also occurs under 3.6.0)
- Platform: macOS-13.4.1-x86_64-i386-64bit
- Python version: 3.11.4
- Pipelines: de_core_news_md (3.5.0), en_core_web_trf (3.5.0), en_core_web_sm (3.5.0), en_core_web_md (3.5.0)
I tried the same thing on another computer (Raspberry Pi) and it worked flawlessly, but on my Macbook, it hangs.
I can move this to the thinc github if you prefer.
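Because the behaviour differs between machines, it might help to compare which native BLAS/OpenMP libraries end up loaded in the process on each one. This is a guess at a useful diagnostic, not a diagnosis of the cause. The sketch below assumes the third-party `threadpoolctl` package is installed (it returns an empty list otherwise), and `loaded_threadpools` is a hypothetical helper name.

```python
import importlib


def loaded_threadpools(modules=("numpy",)):
    """Import `modules`, then list the native thread pools found in the process.

    Each entry is (user_api, internal_api, num_threads, filepath).
    Returns [] if threadpoolctl is not available.
    """
    try:
        from threadpoolctl import threadpool_info
    except ImportError:
        return []  # assumption: threadpoolctl may not be installed

    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            pass  # skip modules that are missing on this machine

    return [
        (
            lib.get("user_api"),
            lib.get("internal_api"),
            lib.get("num_threads"),
            lib.get("filepath"),
        )
        for lib in threadpool_info()
    ]
```

Comparing `loaded_threadpools(("numpy",))` with `loaded_threadpools(("numpy", "spacy"))` on the affected and unaffected machines would show whether importing spaCy brings in an additional library.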
About this issue
- State: closed
- Created a year ago
- Comments: 17 (7 by maintainers)
It’s fine, thanks.
@svlandeg No need to apologize, you provide amazing software for free 😃 And I have a hunch this issue is very idiosyncratic: it is unlikely to affect everyone, or even a large fraction of those using spacy on a Mac, otherwise there’d be a lot more complaints.
Anyways to clarify:
This works:
This also terminates successfully:
This however takes up all CPU for longer than I’m willing to tolerate with these temperatures without terminating:
I noticed this when trying to run sklearn.metrics.pairwise.cosine_similarity on some spacy vectors. There may be some additional conditions I haven’t found yet.