scikit-learn: pca.fit_transform returns error: array must not contain infs or NaNs
When I call PCA’s fit_transform method, I get the error "array must not contain infs or NaNs".
Actual code:
sklearn_pca = PCA(n_components=3)
input_vec = sklearn_pca.fit_transform(normalised_tfidf)
Traceback (most recent call last):
File "C:\Users\User\workspace\caseStudy\main.py", line 135, in <module>
input_vec = sklearn_pca.fit_transform(normalised_tfidf)
File "C:\Users\User\anaconda3\lib\site-packages\sklearn\decomposition\_pca.py", line 369, in fit_transform
U, S, V = self._fit(X)
File "C:\Users\User\anaconda3\lib\site-packages\sklearn\decomposition\_pca.py", line 418, in _fit
return self._fit_truncated(X, n_components, self._fit_svd_solver)
File "C:\Users\User\anaconda3\lib\site-packages\sklearn\decomposition\_pca.py", line 532, in _fit_truncated
U, S, V = randomized_svd(X, n_components=n_components,
File "C:\Users\User\anaconda3\lib\site-packages\sklearn\utils\extmath.py", line 354, in randomized_svd
Uhat, s, V = linalg.svd(B, full_matrices=False)
File "C:\Users\User\anaconda3\lib\site-packages\scipy\linalg\decomp_svd.py", line 109, in svd
a1 = _asarray_validated(a, check_finite=check_finite)
File "C:\Users\User\anaconda3\lib\site-packages\scipy\_lib\_util.py", line 246, in _asarray_validated
a = toarray(a)
File "C:\Users\User\anaconda3\lib\site-packages\numpy\lib\function_base.py", line 498, in asarray_chkfinite
raise ValueError(
ValueError: array must not contain infs or NaNs
I checked whether the input array contains infs or NaNs:
np.any(np.isnan(normalised_tfidf))
Out[2]: False
np.any(np.isinf(normalised_tfidf))
Out[3]: False
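The two checks above only cover the input itself, while the traceback shows the error being raised on an intermediate matrix inside randomized_svd. A slightly broader diagnostic may help narrow things down. The sketch below is only an illustration: the helper name describe_input is hypothetical (it is not part of scikit-learn or NumPy), and it assumes a dense 2-D array.

```python
import numpy as np

def describe_input(X):
    """Collect basic numerical diagnostics for a dense 2-D array (hypothetical helper)."""
    X = np.asarray(X)
    finite = np.isfinite(X)
    return {
        "dtype": str(X.dtype),
        "shape": X.shape,
        "all_finite": bool(finite.all()),
        "n_nan": int(np.isnan(X).sum()),
        "n_inf": int(np.isinf(X).sum()),
        # Largest magnitude among the finite entries; very large values can
        # overflow in later matrix products even when the input is finite.
        "max_abs": float(np.abs(X[finite]).max()) if finite.any() else float("nan"),
        # Columns whose minimum equals their maximum carry no variance.
        "n_constant_columns": int((X.max(axis=0) == X.min(axis=0)).sum()),
    }
```

Calling describe_input(normalised_tfidf) and posting the resulting dictionary alongside the traceback gives readers more to work with than the two boolean checks alone.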
Versions: Python 3.8, Anaconda 1.9.12, scikit-learn 0.23.1
About this issue
- State: closed
- Created 4 years ago
- Comments: 16 (2 by maintainers)
Issue resolved. I cannot reproduce it anymore.
Hello,
Can you share the solution, please? Since upgrading past sklearn 0.22.1, I have had the same issue with many random datasets that never produced such errors before.
Oddly enough, if I add a simple retry, e.g.:
try:
    x_pca = pca.fit_transform(PCA_data)
except:
    x_pca = pca.fit_transform(PCA_data)
then the second run ALWAYS passes without error…
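A retry that succeeds on the second attempt fits with the randomized solver seen in the traceback, which draws fresh random numbers on each call when random_state is not set. That is only a guess at what is happening here, not a confirmed cause. The sketch below pins the seed so a failure becomes repeatable, and falls back to the exact solver (svd_solver="full") if the randomized one raises. The function name fit_pca_reproducibly and the default seed are assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

def fit_pca_reproducibly(X, n_components=3, seed=0):
    """Fit PCA with a fixed seed; fall back to the exact solver on ValueError.

    Returns the transformed data and the name of the solver that succeeded.
    """
    X = np.asarray(X, dtype=np.float64)
    try:
        # Fixed random_state makes the randomized solver deterministic,
        # so a failure (if any) can be reproduced and reported.
        pca = PCA(n_components=n_components, svd_solver="randomized",
                  random_state=seed)
        return pca.fit_transform(X), "randomized"
    except ValueError as exc:
        print(f"randomized solver failed with seed={seed}: {exc}")
        # The exact solver does not use random projections.
        pca = PCA(n_components=n_components, svd_solver="full")
        return pca.fit_transform(X), "full"
```

Unlike a bare except, this catches only ValueError and reports which seed failed, so the failing case can be shared rather than silently retried.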
Since I have seen people discussing this issue on forums, I believe it is worth solving in a general way.
@MichalRIcar I just did and it still doesn’t work 😕
Could you provide the dataset that caused the issue so that we can reproduce it?
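One way to share a failing input is to dump the exact array to a compressed .npz file and attach it. The sketch below assumes a dense array; the function name save_repro_case and the key name "X" are hypothetical choices for illustration.

```python
import numpy as np

def save_repro_case(X, path):
    """Save the array passed to fit_transform so others can reload it unchanged."""
    X = np.asarray(X)
    # savez_compressed keeps dtype and shape intact, which matters when the
    # failure depends on exact floating-point values.
    np.savez_compressed(path, X=X)
    return path

def load_repro_case(path):
    """Reload an array written by save_repro_case."""
    with np.load(path) as data:
        return data["X"]
```

Anyone who receives the file can then call load_repro_case and pass the result straight to PCA(n_components=3).fit_transform to try to reproduce the error.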