scipy: Too-strict version restrictions on NumPy break distribution builds

I asked the same question over at scikit-image, but I’m still not clear on why there is a hard requirement on a very specific version of NumPy in the pyproject.toml for both of these packages. This exact NumPy pin breaks Void Linux builds; patching out the version restriction doesn’t seem to break anything.

If there are actual API changes that break SciPy builds, a range of supported versions seems more appropriate. If the restriction instead exists to enforce a specific version for isolated wheel builds, there must be a more appropriate way to apply it when creating the build venv.

About this issue

  • State: closed
  • Created a year ago
  • Reactions: 1
  • Comments: 26 (19 by maintainers)

Most upvoted comments

When the SciPy organization builds wheels for distribution, the virtual environment used to create those wheels should be responsible for installing the oldest supported NumPy to provide the wheel with the broadest compatibility.

pyproject.toml exists in part to avoid exactly that: it encodes our most portable wheel specification in a standardized way, so I’m pretty strongly against the changes to our distribution policy being suggested here. If you want to build a less-portable version of SciPy from source, with whatever dependencies you feel like, go ahead and disable build isolation and make your own custom non-standard-compliant environment, rather than asking us to change our standard approach to distribution.

From the pip install docs, use this option:

--no-build-isolation

Disable isolation when building a modern source distribution. Build dependencies specified by PEP 518 must be already installed if this option is used.

https://peps.python.org/pep-0518/

Put whatever you want in your virtual environment.

It seems crazy to me that we’d even consider changing our pyproject.toml in some custom way when downstream consumers (including other distributions) can just use a simple command line option built into the ecosystem to do whatever they want (at their own peril, but the flexibility is yours). For those who trust our judgement, or don’t even know what is going on under the hood, the current default to make the most portable wheel makes the most sense to me by quite a wide margin.

I’ll make a few points briefly, because I’m short on time:

Most important point: NumPy 1.25.0 changes how all this works, so for the next release we will drop these pins (and hence we will not make minor tweaks now, that’s pretty much pointless). See https://numpy.org/doc/1.25/dev/depending_on_numpy.html#adding-a-dependency-on-numpy for context.

But anyone who triggers the building-from-source path without knowing is already, most likely, going to get build errors for about a dozen different reasons.

This isn’t quite true on Linux, where things tend to work reasonably well: users typically have compilers installed, so if BLAS/LAPACK is also present, the build typically works. And users on platforms like ppc64le or Alpine, for which we don’t supply wheels, kinda rely on this path.

I understand the need to support the oldest possible version for the wheels that you distribute. However, the == is still not the right way to handle this:

  1. As you note, any other 1.23.x version of NumPy should be sufficient, but a PEP517 builder will fail unless it finds the exact version specified; even allowing another 1.23.x release requires patching pyproject.toml.
  2. In a source distribution (which is what Void uses to build its package), there is no need to compel building against the oldest supported NumPy version. Void is a rolling-release distribution and provides a fairly up-to-date version of NumPy—currently 1.25.0. When we update SciPy, it is built with the currently packaged version of NumPy, and future updates of NumPy (should) retain backwards compatibility for SciPy at least for some deprecation period. There is no need for Void to support older versions because we only guarantee consistent functionality for all current package versions. (Generally, Void allows piecemeal package updates, but the guiding principle with partial updates is caveat emptor.)
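Point 1 is easy to demonstrate with the packaging library (assumed available here; it implements the PEP 440 matching that pip uses): an exact pin rejects every other 1.23.x release, while a range accepts them. The versions and range below simply mirror this discussion.

```python
# Illustrative: how PEP 440 specifier sets evaluate candidate NumPy versions.
from packaging.specifiers import SpecifierSet

exact_pin = SpecifierSet("==1.23.2")
loose_range = SpecifierSet(">=1.23.2,<1.28.0")

for candidate in ("1.23.2", "1.23.5", "1.25.0"):
    print(candidate,
          "pin:", exact_pin.contains(candidate),
          "range:", loose_range.contains(candidate))
# 1.23.2 satisfies both; 1.23.5 and 1.25.0 satisfy only the range.
```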

I submit that the right approach here is to release source distributions with pyproject.toml restrictions that are as loose as possible. In general, this should be numpy >= 1.23.2 for Python 3.11 (unless you know that some future release actively breaks buildability, in which case the addition of an upper bound would be appropriate). Anybody building from source should be able to build with whatever version of NumPy is sufficiently new rather than exactly specified. When the SciPy organization builds wheels for distribution, the virtual environment used to create those wheels should be responsible for installing the oldest supported NumPy to provide the wheel with the broadest compatibility.

It doesn’t. Our actual build-time requirement on numpy is the same as our runtime requirement, so for 1.11.0 it’s:

numpy>=1.21.6,<1.28.0

We just don’t have a way to express that in metadata.

But this is my central argument: that build-system.requires is the place where numpy>=1.21.6,<1.28.0 should be listed to express the actual requirements, while the desire to build distributable wheels against numpy==1.23.2 for broad and supported compatibility belongs… somewhere more appropriate.
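Concretely, that would look something like the following sketch; the backend and other entries are stand-ins for whatever SciPy actually lists, and only the numpy line is the point:

```toml
[build-system]
build-backend = "mesonpy"
requires = [
    "meson-python",
    # the actual build-time constraint, not the wheel-distribution pin:
    "numpy>=1.21.6,<1.28.0",
]
```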

I appreciate the discussion, which was always somewhat philosophical, as I had already worked around the issue for Void. My intent was to seek clarification on the purpose of these pins and, once their role in wheel building was explained, to illuminate the conflicting constraints of wheel distribution and local building.

As eli-schwartz notes, disabling dependency checks is not desirable because they help us catch oversights in dependencies that sometimes break obviously but can often lead to subtle issues that are hard to track down. For example, in the easy-install days, we had several Void packages that just silently grabbed dependencies over the Internet instead of building against our packaged versions, simply because package authors forgot to include one or more dependencies in our build templates. For our purposes, patching out the version requirement is preferable. I look forward to the day when NumPy 1.25 will make this unnecessary.

The priority on PyPI distribution of wheels is consistent within PyPA and is not entirely unreasonable, although it can be frustrating for Linux distributors who are unable to just use pip and move on. Not only do we have to contend with incorporating Python packages into our package graphs, but some CPython native extensions simply must be built against system libraries and therefore must cope with non-isolated builds. (For example, I had a private project that used ZeroMQ in a C library and a Python helper that would manipulate the sockets and call the library via a CFFI or pybind11 wrapper. The version of PyZMQ I used absolutely had to be linked with my system ZeroMQ libraries; any prebuilt wheel would return context and socket handles that were completely independent of those in the C library.)

I am aware that PyPA and others involved in the debates that advance expectations and standards for Python packaging perceive a lack of involvement in the process by Linux distributors. While I can’t speak for others, I admit that I am generally aloof until something breaks my workflow. In my defense, the Void Linux team is small and our attention is spread widely; it is difficult to notice and engage in these discussions early enough to avert inconvenience.

Whether or not PEP517 can be interpreted to endorse policy preferences in build requirements, I will note that, of the 213 Python packages in Void Linux that build according to PEP517, SciPy and scikit-image are the only two that encode such preferences in build-system.requires.

Because PEP517 doesn’t actually encode this use in the standard, we’re forced to fall back on de facto standards. The practice of two (interconnected) projects in a sea of more than 200 cannot reasonably be considered to set the standard here.

We build PEP517 packages (including SciPy, as of today) with python3 -m build --no-isolation --wheel. I’m not quite sure how I feel about --skip-dependency-check. On the one hand, presumably any missing build dependencies will trigger a hard failure at some point in the build, so maybe the checks aren’t necessary. On the other hand, there may be some legitimate restrictions that should be enforced by the dependency check.

@tylerjereddy you are misconstruing the purpose of PEP518. The purpose of build-system.requires in pyproject.toml is exactly what its name implies: a specification of the requirements to build a Python package, not a representation of policy. Encoding specific versions in this field to enforce a compatibility policy is both nonstandard and an abuse of the field. You’re doing this to make up for a lack of accommodations for these kinds of policy preferences in PEP517/518. This is yet another example of Python packaging running amok without proper regard for distribution packaging. The general attitude of “just live in our ecosystem” is hard to tolerate in a world that supports more than pure Python packages, where distribution packaging can be important for supporting multiple architectures, C libraries, and even optional dependencies in a sane manner.

I’m not asking you to consider changing pyproject.toml “in some custom way”, and we already build with isolation disabled. The issue here is that SciPy actually breaks the true intent of build-system.requires. I’m asking you to leave requires strictly for requirements (which, by your project’s own admission in the comments about “Python versions which aren’t officially supported”, does not include a specific NumPy version) and to find a more appropriate mechanism for enforcing policy preferences in your wheel distribution workflow. Maybe the Python build ecosystem needs a build-system.preferred-versions key or something to automate this, but hijacking build-system.requires is not the answer.

As for users who “trust [your] [judgment] or don’t even know what is going on”, they should be using the wheels you distribute. Almost by definition, anybody building a custom Python package from your source distribution knows what is going on and has a legitimate interest in not being required to use an arbitrarily specific version of NumPy to do so.