scipy: BUG: scipy's pdist(X, metric='dice') VS scipy.spatial.distance.dice() produce different results
Describe your issue.
Please forgive me if this is just my misunderstanding of the difference between the two functions, but I can't find why they would differ. The code should make it clear: I am confused why the results differ for the last pair, [1, 0, 0] and [2, 0, 0], since I assumed the two metrics should be the same.
Thanks!
Reproducing Code Example
# pdist with metric='dice'
from scipy.spatial.distance import pdist
X = [[1, 0, 0], [0, 1, 0]]
tmp1 = pdist(X, metric='dice')
print(tmp1)
# [1.]
X = [[1, 0, 0], [1, 1, 0]]
tmp1 = pdist(X, metric='dice')
print(tmp1)
# [0.33333333]
X = [[1, 0, 0], [2, 0, 0]]
tmp1 = pdist(X, metric='dice')
print(tmp1)
# [0.]
# distance.dice called directly on the same vectors
from scipy.spatial import distance
tmp1 = distance.dice([1, 0, 0], [0, 1, 0])
print(tmp1)
# 1.0
tmp1 = distance.dice([1, 0, 0], [1, 1, 0])
print(tmp1)
# 0.3333333333333333
tmp1 = distance.dice([1, 0, 0], [2, 0, 0])
print(tmp1)
# -0.3333333333333333
Error message
None
SciPy/NumPy/Python version information
1.8.0 1.21.6 sys.version_info(major=3, minor=8, micro=4, releaselevel='final', serial=0)
About this issue
- State: closed
- Created a year ago
- Comments: 15 (11 by maintainers)
Commits related to this issue
- BUG: PR 17538 revisions * add regression test and reviewer-suggested fix for gh-17703 — committed to jjerphan/scipy by tylerjereddy a year ago
Hi @tsabbir96, I think @peterbell10 wanted to fix this along with #17538. Can you confirm, Peter?
I think handling this in the wrapper is good. Type promotion is okay, and AFAIK all the _distance_pybind functions promote to floating point. We really only want to fall back when the input types cannot safely be cast to the supported type, which is mainly boolean metrics.
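For readers following along, here is a rough pure-Python sketch of how the two numbers in the reproducer could arise. It is not SciPy's source: the helper names dice_boolean and dice_arithmetic are hypothetical, and the "arithmetic" path (replacing logical NOT with 1 - x on the raw numbers) is an assumption that happens to reproduce the reported -0.333 for [1, 0, 0] and [2, 0, 0]. The boolean path follows the usual Dice dissimilarity, (c_TF + c_FT) / (2*c_TT + c_TF + c_FT), which matches what pdist printed.

```python
def dice_boolean(u, v):
    """Dice dissimilarity after coercing every entry to True/False."""
    ub = [bool(a) for a in u]
    vb = [bool(b) for b in v]
    c_tt = sum(a and b for a, b in zip(ub, vb))        # both True
    c_tf = sum(a and not b for a, b in zip(ub, vb))    # True in u only
    c_ft = sum((not a) and b for a, b in zip(ub, vb))  # True in v only
    return (c_tf + c_ft) / (2 * c_tt + c_tf + c_ft)


def dice_arithmetic(u, v):
    """Same formula, but 'not x' is computed as '1 - x' on the raw numbers.

    Assumed behaviour for non-boolean input; an entry of 2 gives 1 - 2 = -1,
    which is how a negative "count" can appear.
    """
    n_tt = sum(a * b for a, b in zip(u, v))
    n_tf = sum(a * (1 - b) for a, b in zip(u, v))
    n_ft = sum((1 - a) * b for a, b in zip(u, v))
    return (n_tf + n_ft) / (2 * n_tt + n_tf + n_ft)


u, v = [1, 0, 0], [2, 0, 0]
print(dice_boolean(u, v))     # 0.0
print(dice_arithmetic(u, v))  # -0.3333333333333333
```

Both helpers agree on the 0/1 inputs from the reproducer and diverge only once an entry is neither 0 nor 1, which is consistent with treating dice as a boolean metric. Neither helper guards against two all-zero vectors, where the denominator is zero.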