spaCy: Importing spacy (or thinc) breaks dot product

A very annoying bug that it took me forever to track down … I’m at a loss for what might be going on here.

How to reproduce the behaviour

import numpy as np

# works
small = np.random.randn(5, 6)
small.T @ small

# works
larger = np.random.randn(100, 110)
larger.T @ larger

import spacy  # or import thinc

# works
small.T @ small

# works
larger + larger

# hangs forever, max CPU usage
larger.T @ larger  # larger.T.dot(larger) also hangs

Your Environment

  • spaCy version: 3.5.3 (also occurs under 3.6.0)
  • Platform: macOS-13.4.1-x86_64-i386-64bit
  • Python version: 3.11.4
  • Pipelines: de_core_news_md (3.5.0), en_core_web_trf (3.5.0), en_core_web_sm (3.5.0), en_core_web_md (3.5.0)

I tried the same thing on another computer (a Raspberry Pi) and it worked flawlessly, but on my MacBook it hangs.

I can move this to the thinc github if you prefer.

About this issue

  • State: closed
  • Created a year ago
  • Comments: 17 (7 by maintainers)

Most upvoted comments

It’s fine, thanks.

@svlandeg No need to apologize, you provide amazing software for free 😃 And I have a hunch this issue is very idiosyncratic: it is unlikely to affect everyone, or even a large fraction of those using spacy on a Mac, otherwise there’d be a lot more complaints.

Anyway, to clarify:

This works:

import numpy as np
larger = np.random.randn(100, 110)
larger.T @ larger

This also terminates successfully:

import numpy as np
import spacy

small = np.random.randn(5, 6)
small.T @ small

This, however, maxes out the CPU and does not terminate, at least not within the time I’m willing to tolerate at these temperatures:

import numpy as np
import spacy

larger = np.random.randn(100, 110)
larger.T @ larger

I noticed this when trying to run sklearn.metrics.pairwise.cosine_similarity on some spacy vectors.

There may be some additional conditions I haven’t found yet.
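
One way to narrow down those conditions without locking up the interpreter is to run each case in a fresh Python process with a timeout. The following is only a rough sketch, not something from the original report: the helper names build_snippet and run_case, the 30-second timeout, and the intermediate (50, 55) shape are all assumptions chosen for illustration. It uses only the standard library to drive the child processes.

```python
import subprocess
import sys


def build_snippet(import_spacy: bool, rows: int, cols: int) -> str:
    """Return the source code for one test case as a string (hypothetical helper)."""
    lines = ["import numpy as np"]
    if import_spacy:
        lines.append("import spacy")
    lines.append(f"m = np.random.randn({rows}, {cols})")
    lines.append("m.T @ m")
    return "\n".join(lines)


def run_case(source: str, timeout_s: float = 30.0) -> str:
    """Run source in a fresh interpreter and report 'ok', 'error', or 'timeout'."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            timeout=timeout_s,  # the child is killed if it runs longer than this
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if result.returncode == 0 else "error"


if __name__ == "__main__":
    # (5, 6) and (100, 110) are the shapes from the report;
    # (50, 55) is an arbitrary in-between size added as an assumption.
    shapes = [(5, 6), (50, 55), (100, 110)]
    for import_spacy in (False, True):
        for rows, cols in shapes:
            status = run_case(build_snippet(import_spacy, rows, cols))
            print(f"spacy={import_spacy} shape=({rows}, {cols}): {status}")
```

A "timeout" line would correspond to the hang described above, while "error" means the child process exited with a non-zero status (for example because a package is not installed in that environment).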