scikit-learn: average_precision_score does not return correct AP when all negative ground truth labels
Description
average_precision_score does not return the correct AP when y_true contains only negative labels.
Steps/Code to Reproduce
One can run this piece of dummy code (imports added for completeness):
import numpy as np
from sklearn.metrics import average_precision_score
average_precision_score(np.array([0, 0, 0, 0, 0]), np.array([0.1, 0.1, 0.1, 0.1, 0.1]))
It returns nan instead of the correct value, with the warning:
RuntimeWarning: invalid value encountered in true_divide
recall = tps / tps[-1]
Expected Results
As per this Stackoverflow answer, Recall = 1 when FN = 0, since 100% of the true positives were discovered, and Precision = 1 when FP = 0, since there were no spurious results.
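Under that convention, the all-negative case can be made concrete with plain confusion-matrix counts (a sketch of the argument, not sklearn's implementation):

```python
import numpy as np

y_true = np.array([0, 0, 0, 0, 0])  # all negative ground truth
y_pred = np.array([0, 0, 0, 0, 0])  # no predicted positives

tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # 0
fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # 0
fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # 0

# FN == 0: all (zero) positives were found, so define recall = 1.
recall = tp / (tp + fn) if (tp + fn) > 0 else 1.0
# FP == 0: there were no spurious positives, so define precision = 1.
precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
```

With the naive formulas, both ratios would be 0/0; the convention above resolves each to 1.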
Actual Results
Current output is:
/usr/local/lib/python3.5/dist-packages/sklearn/metrics/ranking.py:415: RuntimeWarning: invalid value encountered in true_divide
recall = tps / tps[-1]
Out[201]: nan
Versions
- Linux-4.4.0-59-generic-x86_64-with-Ubuntu-16.04-xenial
- Python 3.5.2 (default, Nov 17 2016, 17:05:23) [GCC 5.4.0 20160609]
- NumPy 1.12.0
- SciPy 0.18.1
- Scikit-Learn 0.18.1
About this issue
- Original URL
- State: closed
- Created 7 years ago
- Reactions: 10
- Comments: 22 (11 by maintainers)
It’s been 5 years now with this issue. I’ve opened the PR and updated it countless times but the only blocker is the approving review.
No it doesn’t. I think we just need to do something like this to handle this edge case:
@varunagrawal if you do a PR please add a non-regression test with only zeros in y_true.
Hi @lesteve, are you planning to approve this PR? Please let us know. Thank you!
Any update on this issue?