scikit-learn: ARDRegression still crashes when trained on some constant y
Description
Recently, it was reported in #10092 that BayesianRidge and ARDRegression have an issue with training on constant labels. A fix was made in #10095, but it seems the problem is not completely fixed. I have found some specific examples on which it fails, although it works fine for many others.
Steps/Code to Reproduce
Download the pickle file here: https://www.dropbox.com/s/ytb7y2o4ij8kwbu/ard_bug_data.pickle?dl=0
The following code will error out:
import pickle
from sklearn.linear_model import ARDRegression
with open('ard_bug_data.pickle', 'rb') as f:
    Xtr, ytr, Xts = pickle.load(f)
ard = ARDRegression()
ard.fit(Xtr, ytr)
mus, stds = ard.predict(Xts, return_std=True)
Expected Results
No error is thrown. mus is a vector of 97 values that are in ytr. stds should likely be zeros, but I am not sure.
Actual Results
The error is as follows:
Traceback (most recent call last):
  File "<ipython-input-8-acf07ddd7964>", line 10, in <module>
    ard.predict(Xts, return_std=True)
  File "c:\users\sergey\github\scikit-learn\sklearn\linear_model\bayes.py", line 540, in predict
    sigmas_squared_data = (np.dot(X, self.sigma_) * X).sum(axis=1)
ValueError: shapes (97,0) and (1,1) not aligned: 0 (dim 1) != 1 (dim 0)
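The shape mismatch in this traceback can be reproduced with NumPy alone. The sketch below assumes, based only on the shapes in the error message, that every column of X has been dropped by the time the failing line runs, while sigma_ still holds a 1x1 matrix from an earlier iteration. The array contents and the helper name predictive_variance are made up for illustration.

```python
import numpy as np


def predictive_variance(X, sigma):
    """Same expression as the failing line in the traceback."""
    return (np.dot(X, sigma) * X).sum(axis=1)


# Assumption: all feature columns were pruned, so X has zero columns.
X_pruned = np.empty((97, 0))

# Assumption: a stale covariance left over from when one feature remained.
stale_sigma = np.ones((1, 1))
try:
    predictive_variance(X_pruned, stale_sigma)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# A covariance whose shape matches the pruned X yields zero variance instead.
empty_sigma = np.empty((0, 0))
variance = predictive_variance(X_pruned, empty_sigma)
print(variance.shape, variance.sum())
```

If that assumption holds, keeping sigma_ in sync with the set of retained features would be enough for predict to return zero variances, which matches the expected result described above.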
Versions
Windows-10-10.0.15063-SP0
Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 12:30:02) [MSC v.1900 64 bit (AMD64)]
NumPy 1.13.3
SciPy 0.19.1
Scikit-Learn 0.20.dev0
cc @glemaitre
About this issue
- State: closed
- Created 7 years ago
- Comments: 28 (28 by maintainers)
So in the current code, the only thing missing is to update coef_ and sigma_ once the for loop is finished. So I would probably move the estimate of coef_ and lambda_ into a function and call it in the for loop and once after the loop.
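A rough sketch of that refactoring, written against plain NumPy, might look like the following. It is a simplified illustration and not scikit-learn's implementation: the function names update_sigma, update_coef, and fit_ard, the default hyperparameters, and the omission of intercept handling are all assumptions. It factors out the coef_ and sigma_ updates so they can be called inside the loop and once more after it.

```python
import numpy as np


def update_sigma(X, alpha, lambda_, keep):
    """Posterior covariance of the weights for the retained features."""
    X_keep = X[:, keep]
    precision = np.diag(lambda_[keep]) + alpha * X_keep.T @ X_keep
    return np.linalg.pinv(precision, hermitian=True)


def update_coef(X, y, alpha, keep, sigma):
    """Posterior mean of the weights; pruned features stay at zero."""
    coef = np.zeros(X.shape[1])
    coef[keep] = alpha * sigma @ X[:, keep].T @ y
    return coef


def fit_ard(X, y, n_iter=300, tol=1e-3, threshold_lambda=1e4,
            alpha_1=1e-6, alpha_2=1e-6, lambda_1=1e-6, lambda_2=1e-6):
    n_samples, n_features = X.shape
    alpha = 1.0 / (np.var(y) + np.finfo(float).eps)
    lambda_ = np.ones(n_features)
    keep = np.ones(n_features, dtype=bool)
    coef_old = None

    for _ in range(n_iter):
        sigma = update_sigma(X, alpha, lambda_, keep)
        coef = update_coef(X, y, alpha, keep, sigma)

        # Re-estimate the precisions from the current posterior.
        sse = np.sum((y - X @ coef) ** 2)
        gamma = 1.0 - lambda_[keep] * np.diag(sigma)
        lambda_[keep] = (gamma + 2.0 * lambda_1) / (coef[keep] ** 2 + 2.0 * lambda_2)
        alpha = (n_samples - gamma.sum() + 2.0 * alpha_1) / (sse + 2.0 * alpha_2)

        # Prune features whose precision grew past the threshold.
        keep = lambda_ < threshold_lambda
        coef[~keep] = 0.0

        if coef_old is not None and np.sum(np.abs(coef_old - coef)) < tol:
            break
        coef_old = coef.copy()
        if not keep.any():
            break

    # Final update so coef and sigma agree with the final `keep` mask.
    if keep.any():
        sigma = update_sigma(X, alpha, lambda_, keep)
        coef = update_coef(X, y, alpha, keep, sigma)
    else:
        sigma = np.empty((0, 0))
        coef = np.zeros(n_features)
    return coef, sigma, keep


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y_const = np.zeros(50)  # a constant target after centering
coef, sigma, keep = fit_ard(X, y_const)

# Predictive variance term, using only the retained columns.
X_test = rng.normal(size=(7, 3))[:, keep]
std_sq = (np.dot(X_test, sigma) * X_test).sum(axis=1)
print(sigma.shape, std_sq.shape)
```

The point of the final update is that sigma always has one row and column per retained feature, including the case where none are retained, so the variance expression from the traceback stays well defined.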
I also saw that we have a different criterion to stop the iteration. In Tipping's code it would correspond to:
Not sure which is best.