scipy: BUG: breaking change: scipy.stats.mode returned value has changed shape without deprecation
Describe your issue.
The scikit-learn nightly CI started to fail because of a recent change in scipy main:

```python
import numpy as np
from scipy import stats
stats.mode(np.ones(3))[0]
```

On scipy main this returns the scalar `1.0`, while previously it would return `array([1.])`.
Maybe a deprecation cycle is needed for this change?
Reproducing Code Example

```python
import numpy as np
import scipy
scipy.stats.mode(np.ones(3))[0][0]
```
Error message

```
IndexError: invalid index to scalar variable.
```
SciPy/NumPy/Python version information
SciPy 1.10.0.dev0+0.df3fe4e / NumPy 1.24.0.dev0+102.gb84a53df9 / sys.version_info(major=3, minor=10, micro=4, releaselevel='final', serial=0)
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 21 (20 by maintainers)
Commits related to this issue
- Make call to scipy.stats.mode compatible with scipy 1.11.0 Starting with scipy 1.11.0, scipy.stats.mode returns a scalar value for the mode instead of a one-element array. This change was foreshadowe... — committed to HippocampusGirl/sdcflows by HippocampusGirl a year ago
- Update util.py bug fix: scipy 1.9.0 introduced a change in the return shape of scipy.stats.mode. more info here: https://github.com/scipy/scipy/issues/16418 — committed to gerberlab/MDSINE2 by demacdo 8 months ago
@ogrisel I opened gh-16429 to work on this. It reverts the changes of gh-15423 so that the default behavior of `mode` is to retain the axis it acts along (as it was), but warns that this will change in SciPy 1.11.0. It hastily adds a `keepdims` parameter, but the behavior in edge cases is not perfect. I wonder if it would work to just revert gh-15423 and warn about the changing behavior but not offer `keepdims`. Would that work for scikit-learn?

Sorry, I'm late to this party; I missed it because it wasn't in my release blockers review, since it didn't have a `1.9.0` milestone. It seems pretty clear to me that we should revert this for `1.9.0`. It's somewhere in between a regular bug fix and a bc-breaking change, but either way that is pretty much irrelevant here. If we're breaking scikit-learn, we just shouldn't, and it's easy enough to give them one release cycle to fix things up.

@ogrisel it would be great to make the code in scikit-learn robust either way; no version checking is needed, a simple `atleast_1d` call on the result of that `stats.mode` call will work in all cases.

Actually, we will need to release a scikit-learn 1.1.2 compat bugfix before numpy 1.24 is released anyway, because of https://github.com/scikit-learn/scikit-learn/pull/23654.
So feel free to do as you prefer, independently of scikit-learn.
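The `atleast_1d` workaround suggested above can be sketched like this (a minimal sketch; it assumes any scipy version in which `stats.mode` returns either a scalar or a one-element array for the mode):

```python
import numpy as np
from scipy import stats

# stats.mode may return a 0-d scalar (scipy >= 1.11, or keepdims=False)
# or a one-element array (older scipy, or keepdims=True) as its first field.
raw_mode = stats.mode(np.ones(3))[0]

# np.atleast_1d normalizes both cases to a 1-d array, so indexing
# with [0] is safe regardless of the installed scipy version.
mode_value = np.atleast_1d(raw_mode)[0]
print(mode_value)  # 1.0
```

No version check is needed: `np.atleast_1d` is a no-op for arrays that are already one-dimensional and wraps scalars in a shape-`(1,)` array.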
We could do a deprecation cycle, but please see the mailing list post and related issue gh-15423.

Previously, `mode` retained the dimension it acted along, unlike almost all (all?) other `scipy.stats` reduction operations. Due to the inconsistency, this behavior was considered a bug, so it changed without a deprecation cycle. A `keepdims` parameter was added to mitigate the change, but the default is currently `False`. The change will be documented in the 1.9.0 release notes (see gh-16248 - although it's in the enhancement section only right now; it would need to be added to the Backward Incompatible changes section).

We could make the default value `keepdims=True` for `mode` and raise a warning about a future change in behavior, but that will still probably break dependent library CI, and the changes required would be the same: apply `keepdims=True` or change indexing.

Opinions here?
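The trade-off between the two `keepdims` settings can be shown directly (a minimal sketch, assuming scipy >= 1.9 where the `keepdims` parameter is available):

```python
import numpy as np
from scipy import stats

x = np.array([1, 2, 2, 3])

# keepdims=True: the reduced axis is retained, matching mode's
# historical behavior; the result is a one-element array.
res_keep = stats.mode(x, keepdims=True)
print(res_keep.mode)   # array([2])

# keepdims=False: the axis is dropped, consistent with other
# scipy.stats reductions; the result is a scalar.
res_drop = stats.mode(x, keepdims=False)
print(res_drop.mode)   # 2
```

Either way, downstream code that indexes the result with `[0][0]` must either pass `keepdims=True` explicitly or drop the inner index.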
Note before this is closed: even if we don’t make a change in behavior here, we need to copy the release note entry about the change into the Backward Incompatibility section.