scikit-learn: ARDRegression still crashes when trained on some constant y

Description

It was recently reported in #10092 that BayesianRidge and ARDRegression fail when trained on constant labels, and a fix was made in #10095. That fix does not seem to be complete: I have found specific examples on which ARDRegression still fails, although it works fine for many others.

Steps/Code to Reproduce

Download the pickle file here: https://www.dropbox.com/s/ytb7y2o4ij8kwbu/ard_bug_data.pickle?dl=0 The following code will error out:

import pickle
from sklearn.linear_model import ARDRegression

# Xtr, ytr: training data (ytr is constant); Xts: test data
with open('ard_bug_data.pickle', 'rb') as f:
    Xtr, ytr, Xts = pickle.load(f)

ard = ARDRegression()
ard.fit(Xtr, ytr)
# Raises a ValueError when return_std=True
mus, stds = ard.predict(Xts, return_std=True)

Expected Results

No error is thrown. mus is a vector of 97 values, each equal to the constant value in ytr. stds should likely be all zeros, but I am not sure.
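
A rough way to express this expectation as a check (illustrative only; it assumes ytr is constant, and the expected behaviour of stds is still an open question):

import numpy as np

mus, stds = ard.predict(Xts, return_std=True)  # should no longer raise
assert mus.shape == (97,)
assert np.allclose(mus, ytr[0])  # every prediction equals the constant label
assert np.all(stds >= 0)         # presumably (close to) zero, but not settled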

Actual Results

The error is as follows:

Traceback (most recent call last):

  File "<ipython-input-8-acf07ddd7964>", line 10, in <module>
    ard.predict(Xts, return_std=True)

  File "c:\users\sergey\github\scikit-learn\sklearn\linear_model\bayes.py", line 540, in predict
    sigmas_squared_data = (np.dot(X, self.sigma_) * X).sum(axis=1)

ValueError: shapes (97,0) and (1,1) not aligned: 0 (dim 1) != 1 (dim 0)

Versions

Windows-10-10.0.15063-SP0
Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 12:30:02) [MSC v.1900 64 bit (AMD64)]
NumPy 1.13.3
SciPy 0.19.1
Scikit-Learn 0.20.dev0

cc @glemaitre

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 28 (28 by maintainers)

Most upvoted comments

So in the current code, the only thing missing is to update coef_ and sigma_ once the for loop is finished.

So I would probably move the estimation of coef_ and lambda_ into a function, and call it inside the for loop and once more after the loop.
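
A rough sketch of what that refactoring could look like (the helper name and the exact update formulas here are illustrative, based on the standard ARD posterior update, not the actual code in bayes.py):

import numpy as np
from scipy.linalg import pinvh

def update_coef_and_sigma(X, y, coef_, alpha_, lambda_, keep_lambda):
    # Restrict to the features that have not been pruned yet
    X_keep = X[:, keep_lambda]
    # Posterior covariance of the kept weights: (diag(lambda) + alpha * X^T X)^-1
    sigma_ = pinvh(np.diag(lambda_[keep_lambda]) +
                   alpha_ * np.dot(X_keep.T, X_keep))
    # Posterior mean of the kept weights
    coef_[keep_lambda] = alpha_ * np.dot(sigma_, np.dot(X_keep.T, y))
    return coef_, sigma_

# Inside fit: call this helper at each iteration of the for loop, and once
# more after the loop, so that coef_ and sigma_ stay consistent with the
# final keep_lambda mask that predict relies on.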

I also saw that we have a different criterion to stop the iteration. In Tipping's code it would correspond to:

np.max(np.abs(np.log(lambda_[keep_lambda]) - np.log(old_lambda_[keep_lambda])))

Not sure which is best.
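
For reference, a sketch of how that criterion could replace the current coefficient-based convergence check inside the fit loop (variable names follow bayes.py; this is illustrative, not a proposed patch):

# Convergence check on the change in log(lambda_) rather than in coef_
delta = np.max(np.abs(np.log(lambda_[keep_lambda]) -
                      np.log(old_lambda_[keep_lambda])))
if delta < self.tol:
    break
old_lambda_ = lambda_.copy()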

Naively I would do that, but I would also check the original paper cited in the user guide to make sure we don't mess anything up. @agramfort will surely know the algorithm better than me.