scikit-learn: average_precision_score does not return correct AP when all negative ground truth labels

Description

average_precision_score does not return correct AP when y_true is all negative labels.

Steps/Code to Reproduce

One can run this piece of dummy code (imports added so it runs standalone):

import numpy as np
import sklearn.metrics

sklearn.metrics.ranking.average_precision_score(np.array([0, 0, 0, 0, 0]), np.array([0.1, 0.1, 0.1, 0.1, 0.1]))

It returns nan instead of the correct value, with the error:

RuntimeWarning: invalid value encountered in true_divide
recall = tps / tps[-1]
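The nan comes from a 0/0 division: with no positive labels, the cumulative true-positive count tps ends at zero, so tps / tps[-1] divides zero by zero elementwise. A minimal NumPy sketch of just that step (the tps array here is made up for illustration):

import numpy as np

# With all-negative y_true, the cumulative true-positive counts stay at zero.
tps = np.array([0, 0, 0, 0, 0])

# NumPy's true division of 0 by 0 yields nan (normally with a
# RuntimeWarning), which then propagates through the AP computation.
with np.errstate(invalid="ignore"):
    recall = tps / tps[-1]

print(recall)  # every entry is nan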

Expected Results

As per this Stackoverflow answer, Recall = 1 when FN = 0, since 100% of the TP were discovered, and Precision = 1 when FP = 0, since there were no spurious results.

Actual Results

Current output is:

/usr/local/lib/python3.5/dist-packages/sklearn/metrics/ranking.py:415: RuntimeWarning: invalid value encountered in true_divide
  recall = tps / tps[-1]
Out[201]: nan

Versions

Linux-4.4.0-59-generic-x86_64-with-Ubuntu-16.04-xenial Python 3.5.2 (default, Nov 17 2016, 17:05:23) [GCC 5.4.0 20160609] NumPy 1.12.0 SciPy 0.18.1 Scikit-Learn 0.18.1

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Reactions: 10
  • Comments: 22 (11 by maintainers)

Most upvoted comments

It’s been 5 years now with this issue. I’ve opened the PR and updated it countless times but the only blocker is the approving review.

Can you check if #7356 fixes this?

No it doesn’t. I think we just need to do something like this to handle this edge case:

recall = 1 if tps[-1] == 0 else tps / tps[-1]

@varunagrawal if you do a PR please add a non-regression test with only zeros in y_true.
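A minimal sketch of the proposed handling, factored into a hypothetical helper (recall_from_tps is not a real scikit-learn function), together with the kind of non-regression check requested above:

import numpy as np

def recall_from_tps(tps):
    """Hypothetical helper illustrating the proposed edge-case handling.

    When there are no positives at all (tps[-1] == 0), define recall as 1
    at every threshold instead of dividing zero by zero.
    """
    tps = np.asarray(tps, dtype=float)
    if tps[-1] == 0:
        return np.ones_like(tps)
    return tps / tps[-1]

# Non-regression check: all-negative ground truth must not produce nan.
recall = recall_from_tps([0, 0, 0, 0, 0])
assert not np.any(np.isnan(recall))
assert np.all(recall == 1.0)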

Hi @lesteve, are you planning to approve this PR? Please let us know. Thank you!

Any update on this issue?