scipy: BUG: `const` signature changes in `cython_blas` and `cython_lapack` broke backwards compatibility

Describe your issue.

In MNE-Python we run a pip-pre job with the latest scipy-wheels-nightly builds. The latest 1.11.0.dev0+1956.7c74503 pip-pre wheel appears to be buggy:

It worked with 1.11.0.dev0+1926.070b2a8:

Reproducing Code Example

pip install $STD_ARGS --pre --extra-index-url "https://pypi.anaconda.org/scipy-wheels-nightly/simple" numpy scipy

Then some scipy.linalg usage (via scikit-learn in our case); I can dig deeper if the problem isn't obvious from the traceback.
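Judging from the traceback below, a minimal trigger is probably just importing the scikit-learn extension module that cimports the BLAS wrappers (an inference from the traceback, not verified independently):

# Inferred minimal reproducer: initializing sklearn.utils._cython_blas
# makes Cython compare its compiled-in signature strings against the
# capsules exported by scipy.linalg.cython_blas, which is where the
# TypeError below is raised.
import sklearn.utils._cython_blas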

Error message

TypeError: C function scipy.linalg.cython_blas.dasum has wrong signature
________ ERROR at setup of test_make_forward_solution_kit[testing_data] ________
mne/forward/tests/test_make_forward.py:381: in small_surf_src
    src = setup_source_space('sample', 'oct2', subjects_dir=subjects_dir,
<decorator-gen-115>:12: in setup_source_space
    ???
mne/source_space.py:1439: in setup_source_space
    s = _create_surf_spacing(surf, hemi, subject, stype, ico_surf,
mne/surface.py:1066: in _create_surf_spacing
    mmap = _compute_nearest(from_surf['rr'], ico_surf['rr'])
mne/surface.py:503: in _compute_nearest
    tree = _DistanceQuery(xhs, method=method)
mne/surface.py:525: in __init__
    from sklearn.neighbors import BallTree
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/sklearn/neighbors/__init__.py:8: in <module>
    from ._distance_metric import DistanceMetric
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/sklearn/neighbors/_distance_metric.py:4: in <module>
    from ..metrics import DistanceMetric as _DistanceMetric
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/sklearn/metrics/__init__.py:42: in <module>
    from . import cluster
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/sklearn/metrics/cluster/__init__.py:22: in <module>
    from ._unsupervised import silhouette_samples
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/sklearn/metrics/cluster/_unsupervised.py:23: in <module>
    from ..pairwise import pairwise_distances_chunked
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/sklearn/metrics/pairwise.py:40: in <module>
    from ._pairwise_distances_reduction import ArgKmin
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/sklearn/metrics/_pairwise_distances_reduction/__init__.py:89: in <module>
    from ._dispatcher import (
/opt/hostedtoolcache/Python/3.11.3/x64/lib/python3.11/site-packages/sklearn/metrics/_pairwise_distances_reduction/_dispatcher.py:11: in <module>
    from ._base import _sqeuclidean_row_norms32, _sqeuclidean_row_norms64
sklearn/metrics/_pairwise_distances_reduction/_base.pyx:1: in init sklearn.metrics._pairwise_distances_reduction._base
    ???
sklearn/utils/_cython_blas.pyx:1: in init sklearn.utils._cython_blas
    ???
E   TypeError: C function scipy.linalg.cython_blas.dasum has wrong signature (expected __pyx_t_5scipy_6linalg_11cython_blas_d (int *, __pyx_t_5scipy_6linalg_11cython_blas_d *, int *), got __pyx_t_5scipy_6linalg_11cython_blas_d (int const *, __pyx_t_5scipy_6linalg_11cython_blas_d const *, int const *))

SciPy/NumPy/Python version and system information

Platform                Linux-5.15.0-1035-azure-x86_64-with-glibc2.35
Python                  3.11.3 (main, Apr  6 2023, 07:55:46) [GCC 11.3.0]
Executable              /opt/hostedtoolcache/Python/3.11.3/x64/bin/python
├☑ numpy                1.25.0.dev0+1207.gafa98ddef (OpenBLAS 0.3.21 with 1 thread)
├☑ scipy                1.11.0.dev0+1926.070b2a8

About this issue

  • State: closed
  • Created a year ago
  • Reactions: 1
  • Comments: 41 (38 by maintainers)

Most upvoted comments

> How about we close it, and open a new issue with a good summary and a focus on re-introducing the const changes?

Yes I’ll do that this evening, thanks everyone. And apologies for wreaking havoc downstream (sigh… again…)

@ilayn this issue got quite long pretty quickly. How about we close it, and open a new issue with a good summary and a focus on re-introducing the const changes?

I’m warming up to the idea that we should keep two copies of this file, and use some sort of cnp.import_array()-style machinery to select which one to import. I am not fully certain about the details yet, but it should not be difficult. Let’s revert it for now; now that everybody is watching this thread, we’ll ping everyone when we have a workable plan. All feedback is welcome.

And again, please keep us in the loop when you are working on the signatures downstream. We should be able to move together as this affects almost every C-API using package, as I just painfully learned here.

Is there a 160-character summary of the benefits of https://github.com/scipy/scipy/pull/18247? I had a quick look at the top comment in the PR but couldn’t find benchmarks or other hints.

I think for existing scikit-learn releases it is going to be difficult to adjust to this change in the function signature.

For future scikit-learn releases we’d need a way to support both old and new versions of scipy (across this signature change). Does someone have an example of how to do this/existing code to look at?
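There is no official recipe that I know of. One build-time sketch, assuming hypothetical _blas_shim_* files (one declaring the pre-1.11 signatures, one the const-qualified ones), is to select the copy matching the SciPy present at build time, since the capsule signature string must match exactly at cimport time:

# setup.py fragment (illustrative only; the shim files are hypothetical).
import scipy
from packaging.version import Version

if Version(scipy.__version__) >= Version("1.11.0.dev0"):
    blas_shim = "src/_blas_shim_const.pyx"   # const-qualified signatures
else:
    blas_shim = "src/_blas_shim_legacy.pyx"  # pre-1.11 signatures

Note this only helps builds from source; a prebuilt wheel still bakes in one flavor, which is exactly the wheel-compatibility problem discussed in this thread.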

I forgot. I took a note and will come back to this; I’m just trying to stabilize a few open PRs first. Please go ahead if you already have an idea, otherwise I’ll catch up.

Maybe it’s worth asking if Cython could have some way to optionally disable that runtime signature check in specific cases (e.g. for adding const)? From what I understand, if the runtime check is relaxed, things might just work?
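For context on what that check compares: Cython exports each cdef function through a PyCapsule whose name is the exact C signature string, and a cimporting module errors out on any textual mismatch, const qualifiers included. The string is easy to inspect from Python:

# The capsule's repr includes the signature string that a cimporting
# module must match verbatim; with SciPy < 1.11 it is the
# "(int *, ..., int *)" variant quoted in the errors above, while the
# const build carries "int const *" and friends.
import scipy.linalg.cython_blas as cb
print(cb.__pyx_capi__["dasum"])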

In sklearn’s case they apparently also modified the signatures, which is a different story altogether.

When possible, Scikit-learn prefers to use const T * to be compatible with read-only data. When we need to use SciPy’s BLAS API, we cast the pointer to T * to work with SciPy’s previous non-const signatures. Scikit-learn does not modify SciPy’s BLAS signatures.
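A minimal sketch of that pattern (illustrative, not scikit-learn’s actual code): accept read-only data through a const memoryview and cast the pointer back when calling the pre-1.11, non-const dasum:

# cython: language_level=3
# Illustrative .pyx sketch; dasum never writes through its pointers,
# so casting away const at the call site is safe in practice.
from scipy.linalg.cython_blas cimport dasum

def sum_abs(const double[::1] x):
    cdef int n = <int>x.shape[0]
    cdef int incx = 1
    return dasum(&n, <double*>&x[0], &incx)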

Before anything else, we need confirmation that a sklearn (or any other package) built against the new SciPy with const actually fails to work with an old SciPy at runtime.

Using cibuildwheel, I built a scikit-learn wheel with SciPy 1.11.dev0. Everything works when 1.11.dev0 is installed. When I downgrade to 1.10.1, I get the following error:

TypeError: C function scipy.linalg.cython_blas.dasum has wrong signature
(expected __pyx_t_5scipy_6linalg_11cython_blas_d (int const *, __pyx_t_5scipy_6linalg_11cython_blas_d const *, int const *),
got __pyx_t_5scipy_6linalg_11cython_blas_d (int *, __pyx_t_5scipy_6linalg_11cython_blas_d *, int *))

Note that this is the reverse of the error from mne-python’s failing test run:

TypeError: C function scipy.linalg.cython_blas.dasum has wrong signature
(expected __pyx_t_5scipy_6linalg_11cython_blas_d (int *, __pyx_t_5scipy_6linalg_11cython_blas_d *, int *),
got __pyx_t_5scipy_6linalg_11cython_blas_d (int const *, __pyx_t_5scipy_6linalg_11cython_blas_d const *, int const *))

We could backport this to past versions too, but I don’t know how feasible that is.

Aside from feasibility, I think it would make the number of failure modes blow up even more.

At some point we have to bite the bullet and stop supporting old SciPy, so a conda or pip install would need to upgrade SciPy too.

conda(-forge) can handle this much more easily (e.g. by retroactively patching a scipy <1.11 constraint into already-published packages), so it shouldn’t be the main concern. pip is harder, because libraries like sklearn would then need to carry an upper bound for scipy at least for a while, so that a breaking scipy release doesn’t blow up existing installs; that is, unless sklearn throws away all older-scipy compatibility and goes for scipy >=1.11 directly.

Assuming such a harsh break is not in the cards, such a path would be quite painful regardless of when and how it’s rolled out, so I’m wondering if we can turn it around by doing some C-API versioning like numpy does: e.g. if we encounter something compiled against a newer API version than what (an old) scipy has at runtime, we error out. The main issue is that we’d need to leave these markers in the compiled sources somewhere and then check against them, and of course do that for a while in published versions before we can rely on it.
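To make that concrete, a hedged sketch of what such a NumPy-style guard could look like; none of these names exist in SciPy today:

# Hypothetical only: `_capi_version` is not a real SciPy attribute, and
# COMPILED_AGAINST would be recorded from a scipy-provided constant at
# the time the downstream wheel is built.
COMPILED_AGAINST = 2

def check_scipy_blas_capi():
    import scipy.linalg.cython_blas as cb
    runtime = getattr(cb, "_capi_version", 1)  # hypothetical attribute
    if runtime < COMPILED_AGAINST:
        raise ImportError(
            f"compiled against scipy Cython BLAS C-API v{COMPILED_AGAINST}, "
            f"but the installed scipy provides v{runtime}; please upgrade scipy"
        )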

Also, if newer versions of SciPy featuring these changes were to be released, errors might be present with older versions of scikit-learn (i.e. scikit-learn<=1.2.2).

I think this is almost certainly going to be the case. I can check if people really want to know. But that would be too disruptive to the ecosystem I think. FYI it shows up on Windows now, too (haven’t tested macOS):

https://dev.azure.com/mne-tools/mne-python/_build/results?buildId=24921&view=logs&jobId=305851a9-a7bb-55db-0042-7e2b6f48aa1c&j=305851a9-a7bb-55db-0042-7e2b6f48aa1c&t=4d37777d-f36a-53aa-9217-6386d15dddcd

My vote is to revert the int const * change until we can figure out a plan with downstream libraries on how to handle this without causing these problems.

Yes, $ git clean -xdf && python dev.py build does it, thank you.

I’m running into this on a bog-standard dev setup on linux with mamba. Basically, git fetch upstream && git rebase upstream/main breaks any import from scipy.

I was very happy to see the const added to the signatures. My limited understanding is that this is a build problem, not a runtime problem. So it should not be that hard to fix (on the scikit-learn side). Scipy now is (a tiny bit) more correct 😄

@betatim @rkern gave a summary here: https://github.com/scipy/scipy/issues/14262#issuecomment-1528611807

Because of the way Cython interprets what is an acceptable ndarray to pass to those arguments. Without const, Cython checks the writable flag and forbids read-only ndarrays. Adding const is supposed to simply tell Cython to allow either read-only or writable ndarrays. So we check manually whether the specific subroutine is expecting to write to the passed array, and if not, then we should add the const.
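A tiny .pyx sketch of that behavior:

# cython: language_level=3
# Without const, Cython demands a writable buffer; with const it
# accepts read-only and writable ndarrays alike.

def needs_writable(double[::1] x):
    return x[0]   # raises ValueError at call time for read-only input

def read_only_ok(const double[::1] x):
    return x[0]   # fine for both

# With a = np.arange(3.0); a.flags.writeable = False:
#   needs_writable(a) raises "ValueError: buffer source array is read-only"
#   read_only_ok(a) returns 0.0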

https://github.com/scikit-learn/scikit-learn/issues/26290#issue-1687159835 has the error and steps to reproduce the problem.

I think what happens is that the scikit-learn nightly wheel is built against scipy 1.10.1 and then used with the latest scipy.

Another case where I am not sure what will happen is something like scikit-learn 1.2.2 where the wheel will have been built with a version of scipy that contains the old signature. Will the existing scikit-learn 1.2.2 wheel work with the next release of scipy that contains the new signatures?

Agreed, https://github.com/scipy/scipy/pull/18247 is probably the culprit, because the failure reported there is similar and the signature change is problematic. We’ll see what the scikit-learn folks say; more likely they need to change something on their end to accommodate this signature change. I guess older SciPy wheels could have the old signature and newer wheels the new one, so they’d have to be able/willing to use either of them.