BERTopic: TypeError: 'numpy.float64' object cannot be interpreted as an integer
Hey! I had the problem mentioned in this thread, and the update solved it. Now a different one has appeared; I get this error:
TypeError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/bertopic/_bertopic.py in _cluster_embeddings(self, umap_embeddings, documents, partial_fit, y)
3217 try:
-> 3218 self.hdbscan_model.fit(umap_embeddings, y=y)
3219 except TypeError:
9 frames
hdbscan/_hdbscan_tree.pyx in hdbscan._hdbscan_tree.condense_tree()
hdbscan/_hdbscan_tree.pyx in hdbscan._hdbscan_tree.condense_tree()
TypeError: 'numpy.float64' object cannot be interpreted as an integer
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/hdbscan/hdbscan_.py in _tree_to_labels(X, single_linkage_tree, min_cluster_size, cluster_selection_method, allow_single_cluster, match_reference_implementation, cluster_selection_epsilon, max_cluster_size)
76 set of labels and probabilities.
77 """
---> 78 condensed_tree = condense_tree(single_linkage_tree, min_cluster_size)
79 stability_dict = compute_stability(condensed_tree)
80 labels, probabilities, stabilities = get_clusters(
hdbscan/_hdbscan_tree.pyx in hdbscan._hdbscan_tree.condense_tree()
hdbscan/_hdbscan_tree.pyx in hdbscan._hdbscan_tree.condense_tree()
TypeError: 'numpy.float64' object cannot be interpreted as an integer
I checked the data that I feed into the model and none of it is in this format. What’s more, I reran the file I was working on yesterday, which processed successfully then, and now it fails with the same error. Any ideas?
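For context on the message itself: Python raises this TypeError whenever a NumPy float is passed where a true integer is required, regardless of what the input data looks like. The sketch below reproduces the same wording outside of hdbscan. That the compiled condense_tree routine hits this same conversion internally is an assumption, since the traceback does not show that far. The helper name index_error_message is made up for illustration.

```python
import operator

try:
    import numpy as np
    value = np.float64(5.0)  # a NumPy float, the type named in the traceback
except ImportError:
    value = 5.0  # fallback so the sketch runs without NumPy; the message then says 'float'


def index_error_message(x):
    """Return the TypeError message raised when x is used as an integer, or None."""
    try:
        operator.index(x)  # the integer conversion that range() and indexing use
    except TypeError as exc:
        return str(exc)
    return None


print(index_error_message(value))
print(index_error_message(5))  # a real integer converts fine, so this prints None
```

If this prints the same message as the traceback, the error is about a type mismatch inside the library call, which fits with the input data looking clean.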
About this issue
- State: open
- Created a year ago
- Reactions: 10
- Comments: 19 (4 by maintainers)
@ssaee79 The following is working for me in a fresh Google Colab:
First, you install BERTopic as follows:
Then, you restart the runtime to make sure that imports are refreshed.
Finally, the following code is working for me:
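The snippets from this comment are not preserved here. As a stand-in (not the maintainer's code), here is a minimal smoke test that calls hdbscan directly on synthetic data, which is the step that fails in the traceback. It assumes hdbscan and numpy are installed; the function name smoke_test_hdbscan, the two-blob data, and min_cluster_size=10 are illustrative choices.

```python
def smoke_test_hdbscan(n_points=200, seed=0):
    """Cluster two synthetic blobs with HDBSCAN; return labels, or None if not installed."""
    try:
        import numpy as np
        import hdbscan
    except ImportError:
        return None  # nothing to test in this environment

    rng = np.random.default_rng(seed)
    half = n_points // 2
    # Two well-separated 2-D blobs, so clustering should be trivial.
    data = np.vstack([
        rng.normal(loc=0.0, scale=0.3, size=(half, 2)),
        rng.normal(loc=5.0, scale=0.3, size=(n_points - half, 2)),
    ])
    # With an affected install, this call is expected to raise the same TypeError.
    return hdbscan.HDBSCAN(min_cluster_size=10).fit_predict(data)


labels = smoke_test_hdbscan()
if labels is None:
    print("hdbscan or numpy is not installed")
else:
    print("hdbscan ran; number of labels:", len(labels))
```

Run it after restarting the runtime. If it raises the TypeError, the problem is in the hdbscan install itself and BERTopic is not involved.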
This is a problem with hdbscan, not BERTopic, and can be worked around with this method: https://github.com/scikit-learn-contrib/hdbscan/issues/600#issuecomment-1638837464
It’s working now, but I’m not entirely sure what fixed it. I already had Microsoft Visual C++ installed. I had been creating new environments, restarting my computer, etc. I assume one of the fixes here worked, so thank you! Sorry I can’t point to one in particular.
It works for me 😃 Thank you so much!
Did you solve it? Even after running !pip install git+https://github.com/scikit-learn-contrib/hdbscan.git and then !pip install BERTopic, it is still not working.
Hi, how did you resolve it? I’ve been having the same issue and have now created multiple environments, but no luck.
@Rishi-Prakash-TS Did you try from a completely new environment? It often helps to start fresh and then install the packages.
There is a new release of hdbscan on PyPI that will hopefully fix this now.
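To confirm which release an environment actually picked up after upgrading, a quick check along these lines may help. It is only a sketch using the standard library; the package list is a guess at what is relevant here, and the thread does not say which version number contains the fix.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


# Assumed list of packages worth reporting when asking for help with this error.
report = installed_versions(["hdbscan", "bertopic", "umap-learn", "numpy", "Cython"])
for name, ver in report.items():
    print(f"{name}: {ver or 'not installed'}")
```

Pasting this output into a comment makes it much easier for others to tell whether you are still on an affected build.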
Thanks too!