scikit-learn: Bug in AUC metric when TP = 100%?
As an example, this works correctly:
In [13]: import numpy as np
In [14]: from sklearn import metrics
In [15]: true = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0.99]
In [16]: pred = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
In [17]: fpr, tpr, thresholds = metrics.roc_curve(true, pred)
In [18]: metrics.auc(fpr, tpr)
Out[18]: 0.22222222222222221
However, if there are no true negatives (e.g. there is only one class), an error is thrown:
In [19]: true = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
In [20]: pred = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
In [21]: fpr, tpr, thresholds = metrics.roc_curve(true, pred)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-21-35631f51a7c5> in <module>()
----> 1 fpr, tpr, thresholds = metrics.roc_curve(true, pred)

    132     # ROC only for binary classification
    133     if classes.shape[0] != 2:
--> 134         raise ValueError("ROC is defined for binary classification only")
    135
    136     y_score = np.ravel(y_score)

ValueError: ROC is defined for binary classification only
Is this the correct behavior?
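For what it's worth, a minimal caller-side guard (my own sketch, not something scikit-learn provides) avoids the error by checking for the degenerate single-class case before calling roc_curve:

import numpy as np
from sklearn import metrics

true = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
pred = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

# roc_curve needs both classes in y_true, so check for that first.
if len(np.unique(true)) == 2:
    fpr, tpr, thresholds = metrics.roc_curve(true, pred)
    print(metrics.auc(fpr, tpr))
else:
    print("AUC is undefined: only one class present in y_true")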
sklearn.metrics.roc_auc_score() is not defined when no positive example is in the ground truth for a given label (and, symmetrically, the same issue arises when no negative example is in the ground truth).
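The original snippet was not preserved in this thread, but a minimal reconstruction along these lines (assumed data, purely for illustration) runs into the problem:

import numpy as np
from sklearn.metrics import roc_auc_score

# Assumed multilabel data: 5 labels, and the 5th column of y_true is all
# zeros, i.e. that label has no positive example in the ground truth.
y_true = np.array([[1, 0, 1, 0, 0],
                   [0, 1, 0, 1, 0],
                   [1, 1, 0, 0, 0]])
y_score = np.array([[0.9, 0.2, 0.8, 0.3, 0.1],
                    [0.1, 0.7, 0.4, 0.9, 0.2],
                    [0.8, 0.6, 0.3, 0.2, 0.4]])

# Raises a ValueError because the AUROC is undefined for the 5th label.
roc_auc_score(y_true, y_score)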
This fails because the 5th label is always absent from the ground truth (i.e. y_true).
I guess it makes sense that the AUROC is undefined when no positive example is in the ground truth for a given label: since the test set contains no positive example, TP = FN = 0. This means that the TPR is undefined (division by zero), which means that the ROC curve cannot be plotted, which means that the AUROC is undefined.
That being said, in the case of multilabel classification, it would be nice if sklearn.metrics.roc_auc_score() could kindly return a warning as well as the AUROCs for the non-problematic labels instead of throwing an error, like sklearn.metrics.f1_score() does (UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in samples with no true labels. and UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in samples with no predicted labels.). Or perhaps some option like ignore_monoclass_label=True could be added. This way we wouldn't have to eliminate the labels with no positive example ourselves before calling sklearn.metrics.roc_auc_score().
The error message could also be more explicit in the multilabel case.
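For what it's worth, that manual elimination could look roughly like this (my own sketch, reusing the same assumed data as above; not an option scikit-learn provides):

import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([[1, 0, 1, 0, 0],
                   [0, 1, 0, 1, 0],
                   [1, 1, 0, 0, 0]])
y_score = np.array([[0.9, 0.2, 0.8, 0.3, 0.1],
                    [0.1, 0.7, 0.4, 0.9, 0.2],
                    [0.8, 0.6, 0.3, 0.2, 0.4]])

# Keep only the labels (columns) whose ground truth contains both classes,
# then average the per-label AUROCs over those columns ourselves.
valid = [j for j in range(y_true.shape[1]) if len(np.unique(y_true[:, j])) == 2]
aucs = [roc_auc_score(y_true[:, j], y_score[:, j]) for j in valid]
print(np.mean(aucs))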
On 1 November 2014 21:23, Joel Nothman joel.nothman@gmail.com wrote:
I presume it’s a multilabel problem in which some label lacks either positive or negative instances.