scipy: BUG: `const` signature changes in `cython_blas` and `cython_lapack` broke backwards compatibility
Describe your issue.
In MNE-Python we run a pip-pre job with the latest scipy-wheels-nightly
builds. The latest 1.11.0.dev0+1956.7c74503 pip-pre wheel appears to be buggy;
it worked with 1.11.0.dev0+1926.070b2a8.
Reproducing Code Example
pip install $STD_ARGS --pre --extra-index-url "https://pypi.anaconda.org/scipy-wheels-nightly/simple" numpy scipy
Then some scipy.linalg stuff... can dig deeper if it's not obvious what the problem is from the traceback
Error message
TypeError: C function scipy.linalg.cython_blas.dasum has wrong signature
________ ERROR at setup of test_make_forward_solution_kit[testing_data] ________
mne/forward/tests/test_make_forward.py:381: in small_surf_src
src = setup_source_space('sample', 'oct2', subjects_dir=subjects_dir,
<decorator-gen-115>:12: in setup_source_space
???
mne/source_space.py:1439: in setup_source_space
s = _create_surf_spacing(surf, hemi, subject, stype, ico_surf,
mne/surface.py:1066: in _create_surf_spacing
mmap = _compute_nearest(from_surf['rr'], ico_surf['rr'])
mne/surface.py:503: in _compute_nearest
tree = _DistanceQuery(xhs, method=method)
mne/surface.py:525: in __init__
from sklearn.neighbors import BallTree
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/sklearn/neighbors/__init__.py:8: in <module>
from ._distance_metric import DistanceMetric
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/sklearn/neighbors/_distance_metric.py:4: in <module>
from ..metrics import DistanceMetric as _DistanceMetric
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/sklearn/metrics/__init__.py:42: in <module>
from . import cluster
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/sklearn/metrics/cluster/__init__.py:22: in <module>
from ._unsupervised import silhouette_samples
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/sklearn/metrics/cluster/_unsupervised.py:23: in <module>
from ..pairwise import pairwise_distances_chunked
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/sklearn/metrics/pairwise.py:40: in <module>
from ._pairwise_distances_reduction import ArgKmin
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/sklearn/metrics/_pairwise_distances_reduction/__init__.py:89: in <module>
from ._dispatcher import (
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/sklearn/metrics/_pairwise_distances_reduction/_dispatcher.py:11: in <module>
from ._base import _sqeuclidean_row_norms32, _sqeuclidean_row_norms64
sklearn/metrics/_pairwise_distances_reduction/_base.pyx:1: in init sklearn.metrics._pairwise_distances_reduction._base
???
sklearn/utils/_cython_blas.pyx:1: in init sklearn.utils._cython_blas
???
E TypeError: C function scipy.linalg.cython_blas.dasum has wrong signature (expected __pyx_t_5scipy_6linalg_11cython_blas_d (int *, __pyx_t_5scipy_6linalg_11cython_blas_d *, int *), got __pyx_t_5scipy_6linalg_11cython_blas_d (int const *, __pyx_t_5scipy_6linalg_11cython_blas_d const *, int const *))
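For context: Cython exposes each `cython_blas` routine through a PyCapsule whose name is its C signature string, and that name is compared verbatim when another extension cimports the module. A small sketch (using the two signature strings from the error above) showing that the exact-match comparison fails purely because of the added `const`:

```python
# The two signature strings from the TypeError above. Cython's import-time
# check requires them to match exactly.
expected = ("__pyx_t_5scipy_6linalg_11cython_blas_d "
            "(int *, __pyx_t_5scipy_6linalg_11cython_blas_d *, int *)")
got = ("__pyx_t_5scipy_6linalg_11cython_blas_d "
       "(int const *, __pyx_t_5scipy_6linalg_11cython_blas_d const *, int const *)")

assert expected != got  # exact match fails -> "has wrong signature"
# Stripping the const qualifiers shows they are otherwise identical:
assert got.replace(" const", "") == expected
```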
SciPy/NumPy/Python version and system information
Platform Linux-5.15.0-1035-azure-x86_64-with-glibc2.35
Python 3.11.3 (main, Apr 6 2023, 07:55:46) [GCC 11.3.0]
Executable /opt/hostedtoolcache/Python/3.11.3/x64/bin/python
├☑ numpy 1.25.0.dev0+1207.gafa98ddef (OpenBLAS 0.3.21 with 1 thread)
├☑ scipy 1.11.0.dev0+1926.070b2a8
About this issue
- Original URL
- State: closed
- Created a year ago
- Reactions: 1
- Comments: 41 (38 by maintainers)
Yes I’ll do that this evening, thanks everyone. And apologies for wreaking havoc downstream (sigh… again…)
@ilayn this issue got quite long pretty quickly. How about we close it, and open a new issue with a good summary and a focus on re-introducing the `const` changes?
I’m warming up to the idea that we should keep two copies of this file, and use some sort of `cnp.import_array()` type of machinery for selecting which one to import. I am not fully certain about the details yet, but it should not be difficult. Let’s revert it for now, but now that everybody is scanning here, we’ll ping everyone when we have a workable plan. All feedback is welcome.
And again, please keep us in the loop when you are working on the signatures downstream. We should be able to move together, as this affects almost every C-API-using package, as I just painfully learned here.
Is there a 160-character summary of the benefits of https://github.com/scipy/scipy/pull/18247? I had a quick look at the top comment in the PR but couldn’t find benchmarks or other hints.
I think for existing scikit-learn releases it is going to be difficult to adjust to this change in the function signature.
For future scikit-learn releases we’d need a way to support both old and new versions of scipy (across this signature change). Does someone have an example of how to do this/existing code to look at?
I forgot. I took a note and will come back to this; I’m just trying to stabilize a few open PRs first. Please go ahead if you have the idea already; otherwise I’ll catch up.
Maybe it’s worth asking if Cython could have some way to optionally disable that runtime signature check in specific cases (e.g. for adding const)? From what I understand if the runtime check is relaxed, things might just work?
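A relaxed check of the kind suggested above might compare the signature strings after stripping `const` qualifiers. A minimal Python sketch to illustrate the comparison only (the real check lives in Cython-generated C code, and the signature strings here are simplified stand-ins):

```python
def signatures_match(expected: str, got: str) -> bool:
    """Compare two C signature strings, ignoring `const` qualifiers."""
    def strip_const(sig: str) -> str:
        # drop every `const` token and re-normalize whitespace
        return " ".join(sig.replace("const", " ").split())
    return strip_const(expected) == strip_const(got)

old = "double (int *, double *, int *)"
new = "double (int const *, double const *, int const *)"

assert old != new                  # Cython's current exact match fails
assert signatures_match(old, new)  # the relaxed check would accept it
```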
When possible, scikit-learn prefers to use `const T *` to be compatible with read-only data. When we need to use SciPy’s BLAS API, we cast the pointer to `T *` to work with SciPy’s previous non-const signatures. Scikit-learn does not modify SciPy’s BLAS signatures.
Using `cibuildwheel`, I built a scikit-learn wheel with SciPy 1.11.dev0. Everything works when 1.11.dev0 is installed. When I downgrade to 1.10.1, I get the following error (note, this is the reverse of the error from mne-python’s failing test run):
Aside from feasibility, I think it would make the number of failure modes blow up even more.
conda(-forge) can handle this much more easily (e.g. by retroactively patching a constraint `scipy <1.11` into already-published packages), so it shouldn’t be the main concern. pip is harder, because libraries like sklearn would now need to carry an upper bound for scipy, at least for a while, so that a breaking scipy release doesn’t blow up existing installs – that is, unless sklearn throws away all scipy version compatibility and goes for `scipy >=1.11` directly.
Assuming such a harsh break is not in the cards, such a path would be quite painful regardless of when and how it’s rolled out, so I’m wondering if we can turn it around by doing some C-API versioning like numpy does: e.g. if we encounter something compiled against a newer API version than what (an old) scipy has at runtime, then we error out. The main issue is that we’d need to leave these markers in the compiled sources somewhere and then check against them, and of course do that for a while in published versions before we can rely on it.
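The numpy-style versioning idea could look roughly like this at runtime. Everything below (names, version numbers, exact policy) is hypothetical – a sketch of the check, not a proposal for the actual mechanism:

```python
# Hypothetical: the compiled extension records the C-API version it was
# built against, and scipy exposes the version it provides at runtime.
RUNTIME_API_VERSION = 2  # what the installed scipy provides (made up)

def check_c_api(compiled_against: int,
                runtime: int = RUNTIME_API_VERSION) -> None:
    # numpy-style policy: a module built against a newer API than the
    # runtime provides cannot work, so fail loudly at import time
    if compiled_against > runtime:
        raise ImportError(
            f"module compiled against C-API v{compiled_against}, but the "
            f"installed scipy only provides v{runtime}"
        )

check_c_api(2)  # OK: built against the running version
try:
    check_c_api(3)  # built against a newer, incompatible API
except ImportError as exc:
    print("rejected:", exc)
```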
I think this is almost certainly going to be the case. I can check if people really want to know. But that would be too disruptive to the ecosystem I think. FYI it shows up on Windows now, too (haven’t tested macOS):
https://dev.azure.com/mne-tools/mne-python/_build/results?buildId=24921&view=logs&jobId=305851a9-a7bb-55db-0042-7e2b6f48aa1c&j=305851a9-a7bb-55db-0042-7e2b6f48aa1c&t=4d37777d-f36a-53aa-9217-6386d15dddcd
My vote is to revert the `int const *` change until we can figure out a plan with downstream libraries on how to handle this without causing these problems.
Yes, `$ git clean -xdf && python dev.py build` does it, thank you. I’m running into this on a bog-standard dev setup on Linux with mamba. Basically, `git fetch upstream && git rebase upstream/main` breaks any import from scipy.
I was very happy to see the const added to the signatures. My limited understanding is that this is a build problem, not a runtime problem, so it should not be that hard to fix (on the scikit-learn side). SciPy is now (a tiny bit) more correct 😄
@betatim @rkern gave a summary here: https://github.com/scipy/scipy/issues/14262#issuecomment-1528611807
https://github.com/scikit-learn/scikit-learn/issues/26290#issue-1687159835 has the error and steps to reproduce the problem.
I think what happens is that the scikit-learn nightly wheel is built with scipy 1.10.1 and is then used with the latest scipy.
Another case where I am not sure what will happen is something like scikit-learn 1.2.2 where the wheel will have been built with a version of scipy that contains the old signature. Will the existing scikit-learn 1.2.2 wheel work with the next release of scipy that contains the new signatures?
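One way a wheel could guard against exactly this mismatch is a version gate at import time. A sketch, assuming (per this issue) that the `const` signatures first appear in SciPy 1.11; the function names and the gating policy are illustrative, not an existing API:

```python
def major_minor(version: str) -> tuple:
    """Parse '1.10.1' (or '1.11.0.dev0+...') into (1, 10) / (1, 11)."""
    return tuple(int(p) for p in version.split(".")[:2])

CONST_SIGNATURES_SINCE = (1, 11)  # per this issue; illustrative constant

def abi_compatible(runtime_scipy: str, built_against: str) -> bool:
    # The wheel only loads if both versions sit on the same side of the
    # signature change: both pre-1.11, or both 1.11+.
    return ((major_minor(runtime_scipy) >= CONST_SIGNATURES_SINCE)
            == (major_minor(built_against) >= CONST_SIGNATURES_SINCE))

assert abi_compatible("1.10.1", "1.10.0")      # both old signatures
assert abi_compatible("1.12.2", "1.11.0")      # both new signatures
assert not abi_compatible("1.10.1", "1.11.0")  # the failure in this thread
```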
Agreed, https://github.com/scipy/scipy/pull/18247 is probably the culprit, because the failure reported there is similar and the signature change is problematic. We’ll see what the scikit-learn folks say; most likely they need to change something on their end to accommodate this signature change. I guess older SciPy wheels could have the old signature and newer wheels the new one, so they’ll have to be able/willing to use either of them.