scikit-learn: fowlkes_mallows_score returns nan in binary classification
Description
fowlkes_mallows_score doesn’t work properly for large binary classification vectors: it returns values that are not between 0 and 1, or returns nan. In general, the equation shown in the documentation doesn’t yield the same results as the function.
Steps/Code to Reproduce
Edited by @jnothman: this reference implementation is incorrect. See comment below.
import numpy as np
import sklearn.metrics  # import the submodule explicitly; a bare "import sklearn" may not load it

def get_FMI(true, predicted):
    # Counts taken from the classification confusion matrix
    c = sklearn.metrics.confusion_matrix(true, predicted)
    TP = c[1][1]
    FP = c[0][1]
    FN = c[1][0]
    FMI = TP / np.sqrt((TP + FP) * (TP + FN))
    print('Should be', FMI)
    print('Is', sklearn.metrics.fowlkes_mallows_score(true, predicted))

# large vector
get_FMI(np.random.choice([0, 1], 1362), np.random.choice([0, 1], 1362))
# small vector
get_FMI(np.random.choice([0, 1], 100), np.random.choice([0, 1], 100))
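For comparison, here is a minimal pair-counting sketch of the FMI. It assumes the TP, FP and FN in the documented formula count pairs of samples rather than cells of a classification confusion matrix, which would explain why the reference above is flagged as incorrect. `pairwise_fmi` is a hypothetical helper, not part of scikit-learn, and it is O(n²), so it only suits small vectors.

```python
from itertools import combinations
from math import sqrt


def pairwise_fmi(labels_true, labels_pred):
    """Brute-force FMI over all sample pairs (hypothetical helper)."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1  # pair grouped together in both labelings
        elif same_pred:
            fp += 1  # grouped together only in the prediction
        elif same_true:
            fn += 1  # grouped together only in the ground truth
    if tp == 0:
        return 0.0
    # Python ints are arbitrary precision, so this product cannot overflow
    return tp / sqrt((tp + fp) * (tp + fn))
```

Because only pair co-membership matters, swapping the label names (0 for 1 and vice versa) leaves the score unchanged, which a confusion-matrix formula does not guarantee.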
Expected Results
Should be 0.487888392921 Is 0.487888392921
Should be 0.548853049023 Is 0.548853049023
Actual Results
Should be 0.487888392921 Is 15.3260054113
Should be 0.548853049023 Is 0.501109879279
Versions
Windows-10-10.0.10586-SP0
Python 3.5.2 |Anaconda custom (64-bit)| (default, Jul 5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)]
NumPy 1.11.2
SciPy 0.18.1
Scikit-Learn 0.18.1
About this issue
- State: closed
- Created 8 years ago
- Comments: 23 (19 by maintainers)
Commits related to this issue
- [MRG+1] fowlkes_mallows_score: more unit tests (Fixes #8101) (#8140) — committed to scikit-learn/scikit-learn by devanshdalal 7 years ago
- [MRG+1] fowlkes_mallows_score: more unit tests (Fixes #8101) (#8140) — committed to raghavrv/scikit-learn by devanshdalal 7 years ago
- [MRG+1] fowlkes_mallows_score: more unit tests (Fixes #8101) (#8140) — committed to sergeyf/scikit-learn by devanshdalal 7 years ago
- [MRG+1] fowlkes_mallows_score: more unit tests (Fixes #8101) (#8140) — committed to Sundrique/scikit-learn by devanshdalal 7 years ago
- [MRG+1] fowlkes_mallows_score: more unit tests (Fixes #8101) (#8140) — committed to NelleV/scikit-learn by devanshdalal 7 years ago
- [MRG+1] fowlkes_mallows_score: more unit tests (Fixes #8101) (#8140) — committed to paulha/scikit-learn by devanshdalal 7 years ago
- [MRG+1] fowlkes_mallows_score: more unit tests (Fixes #8101) (#8140) — committed to maskani-moh/scikit-learn by devanshdalal 7 years ago
tk, pk and qk follow the same equation as the reference given in the documentation. The above code gave me an error on
tk / np.sqrt(pk * qk) if tk != 0. else 0.
which I could fix with
tk / np.sqrt(pk) / np.sqrt(qk) if tk != 0. else 0.
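A likely explanation, assuming tk, pk and qk are 32-bit integers (the default NumPy integer on Windows for the versions listed): pk * qk wraps around before the square root is taken. That gives nan when the wrapped product is negative and an out-of-range score when it is small and positive. The sketch below uses made-up pair counts of roughly the size a 1362-sample binary problem would produce; the values are illustrative, not taken from the report.

```python
import numpy as np

# Illustrative pair counts (assumed values), forced to int32 to mimic
# the default integer width on Windows with older NumPy releases.
tk = np.array([463000], dtype=np.int32)
pk = np.array([926000], dtype=np.int32)
qk = np.array([926000], dtype=np.int32)

# The integer product exceeds the int32 range and wraps to a negative
# number, so the square root is invalid and the result is nan.
with np.errstate(invalid="ignore"):
    naive = tk / np.sqrt(pk * qk)

# Taking each square root first converts to float64 before anything is
# multiplied, so no integer product is ever formed.
safe = tk / np.sqrt(pk) / np.sqrt(qk)

print("naive:", naive[0])
print("safe: ", safe[0])
```

Casting pk and qk to float64 (or int64) before multiplying would be another way to avoid the wraparound; splitting the square root simply needs no dtype change.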