BERTopic: Github actions: ValueError: numpy.ndarray size changed, may indicate binary incompatibility.
The GitHub Actions workflow is suddenly giving me the following error:
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
It seems that it most likely has to do with numpy-based binary compatibility issues (some more info here). However, I have not been able to fix it so far with the suggested method (setting oldest-supported-numpy in pyproject.toml).
If you have any idea, please follow along with the full discussions here. Any help is greatly appreciated!
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 26 (11 by maintainers)
The last few days I have been bug-fixing this as much as I could. However, it seems that the issue stems from ABI issues between HDBSCAN and Numpy. Whenever a major version is released from Numpy, there is a chance that it will break HDBSCAN if used together with UMAP.
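For anyone trying to narrow this down in their own environment, here is a rough diagnostic sketch, not an official BERTopic tool. It assumes Python 3.8+ (for importlib.metadata) and that the relevant distributions are named numpy, hdbscan and umap-learn, with umap as the import name of the latter. The helper names (installed_version, check_import, report) are made up for illustration. It only reports which versions are installed and whether importing each module triggers the binary incompatibility ValueError.

```python
import importlib
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_import(module_name):
    """Try to import a module and classify the outcome as a short label."""
    try:
        importlib.import_module(module_name)
    except ModuleNotFoundError:
        return "not-installed"
    except ImportError:
        # Installed, but the import failed for some other reason.
        return "import-error"
    except ValueError as exc:
        # The ABI mismatch from this issue surfaces as a ValueError at import time.
        if "binary incompatibility" in str(exc):
            return "abi-mismatch"
        return "value-error"
    return "ok"


def report(pairs=(("numpy", "numpy"), ("hdbscan", "hdbscan"), ("umap-learn", "umap"))):
    """Collect (distribution, version, import status) for each (dist, module) pair.

    The default pairs are an assumption about which packages matter here.
    """
    return [(dist, installed_version(dist), check_import(mod)) for dist, mod in pairs]
```

Calling print(report()) in a fresh environment and in the failing one should show which package versions differ between the two.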
Python 3.7
BERTopic works in Python 3.7 seemingly without any problems; simply running pip install bertopic should work.
Python 3.8+
For now, if you are on Python 3.8 or higher, it seems that the following will work:
Future Fix
At this point, I am not entirely sure how I want to proceed. It seems that numpy>1.20.3 may introduce issues with large datasets on Python 3.8+, as UMAP and HDBSCAN do not work properly together in that case. Thus, there does not seem to be a solid fix for now unless HDBSCAN gets updated to prevent this from happening in the future.
Having said that, any and all help is greatly appreciated!
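If you want to check whether your environment falls in the numpy>1.20.3 range mentioned above, something like the following sketch might help. The function names are hypothetical, and the comparison only looks at the leading numeric X.Y.Z part of the version string, so pre-release and post-release suffixes are ignored. That is a simplification for illustration, not a full version parser.

```python
def parse_version(text):
    """Parse the leading numeric 'X.Y.Z' part of a version string into a 3-tuple of ints."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad short versions such as '1.21' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def newer_than(version_text, threshold=(1, 20, 3)):
    """Return True if the version is strictly newer than the threshold (default 1.20.3)."""
    return parse_version(version_text) > threshold
```

With numpy installed, newer_than(numpy.__version__) tells you whether you are above the version discussed in this thread.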
Okay, so I downgraded Python to 3.7 and now it works. I'm still not sure why it doesn't work with 3.8.
Using conda to install bertopic worked for me.
Conda
To those interested, some of the issues users are having with the installation of BERTopic might be resolved by using conda to install BERTopic.
Installing bertopic from the conda-forge channel can be achieved by adding conda-forge to your channels with:
conda config --add channels conda-forge
Once the conda-forge channel has been enabled, bertopic can be installed with:
conda install bertopic
@Ariannaperla Most likely, you updated to an unsupported numpy or numba version. I would advise starting from a fresh environment and trying the above again. If that does not work, using Python 3.7 might solve your issue.
If all fails, you can also install BERTopic from conda, as instructed here.