scikit-learn: fowlkes_mallows_score returns nan in binary classification

Description

fowlkes_mallows_score doesn’t work properly for large binary classification vectors. It returns values that are not between 0 and 1 or returns nan. In general, the equation shown in the documentation doesn’t yield the same results as the function.

Steps/Code to Reproduce

Edited by @jnothman: this reference implementation is incorrect. See comment below.

import numpy as np
# `import sklearn` alone does not reliably expose sklearn.metrics,
# so import the two functions explicitly.
from sklearn.metrics import confusion_matrix, fowlkes_mallows_score

def get_FMI(true, predicted):
    # Per-sample counts taken from the binary confusion matrix.
    c = confusion_matrix(true, predicted)
    TP = c[1][1]
    FP = c[0][1]
    FN = c[1][0]
    FMI = TP / np.sqrt((TP + FP) * (TP + FN))

    print('Should be', FMI)
    print('Is', fowlkes_mallows_score(true, predicted))

# large vector
get_FMI(np.random.choice([0, 1], 1362), np.random.choice([0, 1], 1362))
# small vector
get_FMI(np.random.choice([0, 1], 100), np.random.choice([0, 1], 100))
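
As the editor's note above says, the reference implementation is not a valid check. A likely reason is that the Fowlkes-Mallows index is defined over pairs of samples (pairs placed in the same cluster by both labelings, by only one, and so on), not over per-sample confusion-matrix cells. The following is a minimal brute-force sketch of that pairwise definition; the function name `pairwise_fmi` is hypothetical and the loop is quadratic in the number of samples, so it is only meant for checking small inputs.

```python
from itertools import combinations
from math import sqrt


def pairwise_fmi(labels_true, labels_pred):
    """Fowlkes-Mallows index by brute-force counting over sample pairs.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    tp = fp = fn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1  # pair grouped together in both labelings
        elif same_pred:
            fp += 1  # grouped together only in the prediction
        elif same_true:
            fn += 1  # grouped together only in the ground truth
    if tp == 0:
        return 0.0
    # Python integers are arbitrary precision, so this product cannot overflow.
    return tp / sqrt((tp + fp) * (tp + fn))


if __name__ == "__main__":
    print(pairwise_fmi([0, 0, 1, 1], [0, 0, 0, 1]))
```

Because only pair membership matters, swapping the label names in one of the vectors leaves the score unchanged, which is not true of the confusion-matrix formula in the reproduction script.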

Expected Results

Should be 0.487888392921 Is 0.487888392921

Should be 0.548853049023 Is 0.548853049023

Actual Results

Should be 0.487888392921 Is 15.3260054113

Should be 0.548853049023 Is 0.501109879279

Versions

Windows-10-10.0.10586-SP0
Python 3.5.2 |Anaconda custom (64-bit)| (default, Jul 5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)]
NumPy 1.11.2
SciPy 0.18.1
Scikit-Learn 0.18.1

About this issue

State: closed
Created 8 years ago
Comments: 23 (19 by maintainers)

Commits related to this issue

[MRG+1] fowlkes_mallows_score: more unit tests (Fixes #8101) (#8140) — committed to scikit-learn/scikit-learn by devanshdalal 7 years ago
[MRG+1] fowlkes_mallows_score: more unit tests (Fixes #8101) (#8140) — committed to raghavrv/scikit-learn by devanshdalal 7 years ago
[MRG+1] fowlkes_mallows_score: more unit tests (Fixes #8101) (#8140) — committed to sergeyf/scikit-learn by devanshdalal 7 years ago
[MRG+1] fowlkes_mallows_score: more unit tests (Fixes #8101) (#8140) — committed to Sundrique/scikit-learn by devanshdalal 7 years ago
[MRG+1] fowlkes_mallows_score: more unit tests (Fixes #8101) (#8140) — committed to NelleV/scikit-learn by devanshdalal 7 years ago
[MRG+1] fowlkes_mallows_score: more unit tests (Fixes #8101) (#8140) — committed to paulha/scikit-learn by devanshdalal 7 years ago
[MRG+1] fowlkes_mallows_score: more unit tests (Fixes #8101) (#8140) — committed to maskani-moh/scikit-learn by devanshdalal 7 years ago

Most upvoted comments

tk, pk and qk follow the same equations as the reference given in the documentation. The above code gave me an error on the expression tk / np.sqrt(pk * qk) if tk != 0. else 0., which I could fix by rewriting it as tk / np.sqrt(pk) / np.sqrt(qk) if tk != 0. else 0.

gan3sh500 on Dec 28, 2016
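
The rewrite quoted in the comment above can be illustrated with a small plain-Python sketch. It assumes (the comment does not say so explicitly) that the error came from the product pk * qk exceeding the range of a 32-bit integer, which was NumPy's default integer width on Windows at the time. The helper names `pair_counts`, `fmi_product` and `fmi_split_sqrt` are hypothetical; tk, pk and qk are computed as sums of squared contingency counts minus the number of samples, which is how the comment's variables are understood here.

```python
from collections import Counter
from math import isclose, sqrt

INT32_MAX = 2**31 - 1


def pair_counts(labels_true, labels_pred):
    """Return (tk, pk, qk) from contingency counts (hypothetical helper)."""
    n = len(labels_true)
    cells = Counter(zip(labels_true, labels_pred))  # contingency table cells
    true_sizes = Counter(labels_true)               # row sums
    pred_sizes = Counter(labels_pred)               # column sums
    tk = sum(c * c for c in cells.values()) - n
    pk = sum(c * c for c in pred_sizes.values()) - n
    qk = sum(c * c for c in true_sizes.values()) - n
    return tk, pk, qk


def fmi_product(tk, pk, qk):
    # Original form: the intermediate pk * qk can be very large.
    return tk / sqrt(pk * qk) if tk != 0 else 0.0


def fmi_split_sqrt(tk, pk, qk):
    # Rewritten form from the comment: no large intermediate product.
    return tk / sqrt(pk) / sqrt(qk) if tk != 0 else 0.0


# Deterministic binary labels with the same length as the "large vector" case.
labels_true = [i % 2 for i in range(1362)]
labels_pred = [(i // 2) % 2 for i in range(1362)]
tk, pk, qk = pair_counts(labels_true, labels_pred)

if __name__ == "__main__":
    print("pk * qk exceeds int32 range:", pk * qk > INT32_MAX)
    print("split-sqrt score:", fmi_split_sqrt(tk, pk, qk))
```

Python's built-in integers do not overflow, so both forms agree here; the point of the sketch is only that for 1362 binary labels the product is already far outside the 32-bit range, while each square root taken separately stays small.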