scikit-learn: kMeans stopped working with numpy 1.22.2

Describe the bug

kMeans is not working anymore with numpy 1.22.2. This is probably similar to https://github.com/scikit-learn/scikit-learn/issues/22683, but I am not sure whether the same fix applies.

Steps/Code to Reproduce

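import numpy as np
from sklearn.cluster import KMeans
k = 2  # assumed value; the original script looped over several k (see traceback)
allLocations = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])
kmeanModel = KMeans(n_clusters=k, random_state=0)
kmeanModel.fit(allLocations)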

Expected Results

Some fitted model

Actual Results

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-30-db8e8220c8b9> in <module>
     12 for k in K:
     13     kmeanModel = KMeans(n_clusters=k, random_state=0)
---> 14     kmeanModel.fit(allLocations)
     15     distortions.append(kmeanModel.inertia_)
     16 #Plotting the distortions

~\anaconda3\lib\site-packages\sklearn\cluster\_kmeans.py in fit(self, X, y, sample_weight)
   1169         if self._algorithm == "full":
   1170             kmeans_single = _kmeans_single_lloyd
-> 1171             self._check_mkl_vcomp(X, X.shape[0])
   1172         else:
   1173             kmeans_single = _kmeans_single_elkan

~\anaconda3\lib\site-packages\sklearn\cluster\_kmeans.py in _check_mkl_vcomp(self, X, n_samples)
   1026         active_threads = int(np.ceil(n_samples / CHUNK_SIZE))
   1027         if active_threads < self._n_threads:
-> 1028             modules = threadpool_info()
   1029             has_vcomp = "vcomp" in [module["prefix"] for module in modules]
   1030             has_mkl = ("mkl", "intel") in [

~\anaconda3\lib\site-packages\sklearn\utils\fixes.py in threadpool_info()
    323         return controller.info()
    324     else:
--> 325         return threadpoolctl.threadpool_info()
    326 
    327 

~\anaconda3\lib\site-packages\threadpoolctl.py in threadpool_info()
    122     In addition, each module may contain internal_api specific entries.
    123     """
--> 124     return _ThreadpoolInfo(user_api=_ALL_USER_APIS).todicts()
    125 
    126 

~\anaconda3\lib\site-packages\threadpoolctl.py in __init__(self, user_api, prefixes, modules)
    338 
    339             self.modules = []
--> 340             self._load_modules()
    341             self._warn_if_incompatible_openmp()
    342         else:

~\anaconda3\lib\site-packages\threadpoolctl.py in _load_modules(self)
    371             self._find_modules_with_dyld()
    372         elif sys.platform == "win32":
--> 373             self._find_modules_with_enum_process_module_ex()
    374         else:
    375             self._find_modules_with_dl_iterate_phdr()

~\anaconda3\lib\site-packages\threadpoolctl.py in _find_modules_with_enum_process_module_ex(self)
    483 
    484                 # Store the module if it is supported and selected
--> 485                 self._make_module_from_path(filepath)
    486         finally:
    487             kernel_32.CloseHandle(h_process)

~\anaconda3\lib\site-packages\threadpoolctl.py in _make_module_from_path(self, filepath)
    513             if prefix in self.prefixes or user_api in self.user_api:
    514                 module_class = globals()[module_class]
--> 515                 module = module_class(filepath, prefix, user_api, internal_api)
    516                 self.modules.append(module)
    517 

~\anaconda3\lib\site-packages\threadpoolctl.py in __init__(self, filepath, prefix, user_api, internal_api)
    604         self.internal_api = internal_api
    605         self._dynlib = ctypes.CDLL(filepath, mode=_RTLD_NOLOAD)
--> 606         self.version = self.get_version()
    607         self.num_threads = self.get_num_threads()
    608         self._get_extra_info()

~\anaconda3\lib\site-packages\threadpoolctl.py in get_version(self)
    644                              lambda: None)
    645         get_config.restype = ctypes.c_char_p
--> 646         config = get_config().split()
    647         if config[0] == b"OpenBLAS":
    648             return config[1].decode("utf-8")

AttributeError: 'NoneType' object has no attribute 'split'

Versions

The scikit-learn version is 1.0.2.
The numpy version is 1.22.2.

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 15 (5 by maintainers)

Most upvoted comments

I assume that https://github.com/scikit-learn/scikit-learn/issues/22689#issuecomment-1059234422 solved it. Closing. Feel free to reopen if you consider the issue not fixed.

That’s an issue with threadpoolctl 2.1.0. Upgrading to threadpoolctl 3 or later should allow you to upgrade numpy as well.

If you do it, I’d be curious to see the output of the same commands, because I can’t reproduce locally and it might still show that something’s wrong with the BLAS shipped with numpy.
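The exact commands referred to above are not shown in this thread. As a rough sketch of one way to collect comparable information, the snippet below summarizes the BLAS entries reported by `threadpoolctl.threadpool_info()`. The helper name `summarize_blas` is made up for illustration, and the snippet assumes threadpoolctl 3 or later, since 2.1.0 raises the AttributeError shown in the traceback.

```python
def summarize_blas(modules):
    """Return one line per BLAS entry from a threadpool_info()-style list of dicts.

    `summarize_blas` is a hypothetical helper, not part of threadpoolctl.
    """
    lines = []
    for module in modules:
        if module.get("user_api") == "blas":
            lines.append(
                f'{module.get("internal_api")} {module.get("version")} '
                f'({module.get("num_threads")} threads): {module.get("filepath")}'
            )
    return lines


if __name__ == "__main__":
    try:
        import numpy  # noqa: F401  (importing numpy loads its BLAS into the process)
        from threadpoolctl import threadpool_info
    except ImportError as exc:
        print(f"missing dependency: {exc}")
    else:
        for line in summarize_blas(threadpool_info()):
            print(line)
```

If no lines are printed, no BLAS library was detected in the current process, which would itself be worth reporting.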

If using Jupyter, restart the kernel after updating threadpoolctl.
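After upgrading and restarting, it can help to confirm which threadpoolctl version the running interpreter actually sees. The following is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+); the helper names `needs_upgrade` and `check_threadpoolctl` are made up for illustration, and the threshold of major version 3 is taken from the comment above.

```python
from importlib.metadata import PackageNotFoundError, version


def needs_upgrade(installed: str, minimum_major: int = 3) -> bool:
    """Return True if the installed major version is below minimum_major."""
    major = int(installed.split(".")[0])
    return major < minimum_major


def check_threadpoolctl() -> str:
    """Describe the threadpoolctl version visible to this interpreter."""
    try:
        installed = version("threadpoolctl")
    except PackageNotFoundError:
        return "threadpoolctl is not installed"
    if needs_upgrade(installed):
        return f"threadpoolctl {installed} found; consider: pip install -U threadpoolctl"
    return f"threadpoolctl {installed} found; version 3 or later"


if __name__ == "__main__":
    print(check_threadpoolctl())
```

Running this in a notebook cell (calling `check_threadpoolctl()` directly) shows the version seen by that kernel, which may differ from the one seen in a terminal if several environments are installed.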