scipy: BUG: Missing symbol when used with reference LAPACK: cblas_cdotc_sub

I’ve been testing scipy against all the BLAS/LAPACK variants available in conda-forge for a while, and though there are small issues here and there, generally things work everywhere.

Since scipy 1.10.1 (or some other change in the last 3 months), however, there is a kind of error I haven’t seen before: a missing symbol (`cblas_cdotc_sub`), but only for one BLAS/LAPACK flavour, namely the “netlib” one from https://github.com/Reference-LAPACK/lapack/.

This happens for linux on all arches (x64, aarch64, ppc64le), but not on osx/win.

Since this is causing an error directly upon import (rather than a small handful of failed tests), this has a much bigger blast radius.

import: 'scipy.cluster'
Traceback (most recent call last):
  [...]
    from . import _iterative
ImportError: $PREFIX/lib/python3.10/site-packages/scipy/sparse/linalg/_isolve/_iterative.cpython-310-x86_64-linux-gnu.so: undefined symbol: cblas_cdotc_sub
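For anyone wanting to check several submodules at once, here is a minimal sketch of an import smoke test. The helper name `find_import_failures` is hypothetical (it is not part of scipy or of the feedstock's test setup), and the list of modules in the usage section is only an example.

```python
import importlib


def find_import_failures(module_names):
    """Try to import each module; return {name: error message} for failures.

    Hypothetical helper for illustration only. An undefined symbol in a
    compiled extension surfaces as ImportError at import time, so it is
    caught here just like a missing module would be.
    """
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            failures[name] = str(exc)
    return failures


if __name__ == "__main__":
    # Example module list (an assumption, adjust as needed).
    for name, message in find_import_failures(
        ["scipy.cluster", "scipy.sparse.linalg"]
    ).items():
        print(f"import: {name!r} failed: {message}")
```

On an affected build, the message for `scipy.cluster` should contain the same `undefined symbol` text as in the traceback above.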

Apparently this is encountered in the wild already: https://github.com/python-control/Slycot/pull/194

About this issue

  • State: closed
  • Created a year ago
  • Comments: 20 (15 by maintainers)

Most upvoted comments

I can reproduce this in an environment with the following changes:

diff --git a/environment.yml b/environment.yml
index 6eb85bbcc..00b0141c7 100644
--- a/environment.yml
+++ b/environment.yml
@@ -3,7 +3,7 @@
 #   $ conda activate scipy-dev
 #
 # Also used to build the `scipy-dev` Docker image via GitHub Actions
-name: scipy-dev
+name: scipy-dev-netlib
 channels:
   - conda-forge
 dependencies:
@@ -15,9 +15,11 @@ dependencies:
   - meson-python
   - ninja
   - numpy
-  - openblas
+  - libblas
+  - libcblas
+  - liblapack
+  - blas-devel
   - pkg-config  # note: not available on Windows
-  - libblas=*=*openblas  # helps avoid pulling in MKL
+  - libblas=*=*netlib  # helps avoid pulling in MKL
   - pybind11
   # scipy.datasets dependency
   - pooch

And then running `python dev.py build -C-Dblas=blas -C-Dlapack=lapack -C-Duse-g77-abi=true` (after fixing another regression).

The issue is that Netlib BLAS has separate `libblas` and `libcblas` libraries, and we’re only linking against `libblas`. We are using CBLAS explicitly, but only when building with `-Duse-g77-abi=true`, which is why this doesn’t show up in CI. So the combination of building against Netlib BLAS and using those g77 ABI wrappers is where things go wrong:

$ objdump -T build/scipy/sparse/linalg/_isolve/_iterative.cpython-310-x86_64-linux-gnu.so | rg cblas_cdot
0000000000000000      D  *UND*  0000000000000000  Base        cblas_cdotc_sub
$ ldd build/scipy/sparse/linalg/_isolve/_iterative.cpython-310-x86_64-linux-gnu.so | rg blas
        libblas.so.3 => /home/rgommers/mambaforge/envs/scipy-dev-netlib/lib/libblas.so.3 (0x00007fca399a3000)
$ objdump -T ~/mambaforge/envs/scipy-dev-netlib/lib/libblas.so | rg cblas_cdot
$ objdump -T ~/mambaforge/envs/scipy-dev-netlib/lib/libcblas.so | rg cblas_cdot
000000000000c590 g    DF .text  000000000000002b  Base        cblas_cdotu_sub
000000000000c5c0 g    DF .text  000000000000002b  Base        cblas_cdotc_sub
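The same check can be sketched from Python with `ctypes`, which may be handy where `objdump` is not available. This is an illustration under assumptions, not part of the fix: `exports_symbol` is a hypothetical helper, and on POSIX the lookup goes through `dlsym()`, which may also find symbols in libraries that the loaded library itself depends on, so it is slightly less strict than reading the dynamic symbol table.

```python
import ctypes
import ctypes.util


def exports_symbol(library, symbol):
    """Report whether `symbol` can be resolved through `library`.

    Hypothetical helper for illustration. Returns True or False when the
    library loads, and None when it cannot be loaded at all.
    """
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        return None
    # Attribute access on a CDLL object performs the symbol lookup and
    # raises AttributeError when the symbol is not found.
    return hasattr(handle, symbol)


if __name__ == "__main__":
    for name in ("blas", "cblas"):
        # find_library returns None if no such library is found.
        path = ctypes.util.find_library(name)
        found = exports_symbol(path, "cblas_cdotc_sub") if path else None
        print(f"lib{name}: path={path!r} resolves cblas_cdotc_sub: {found}")
```

In a Netlib environment like the one above, one would expect `libblas` to report `False` and `libcblas` to report `True`, matching the `objdump` output. With OpenBLAS, both names typically resolve to the same library, which includes the CBLAS symbols.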

A fix is on the way. Our CI situation for BLAS/LAPACK libraries is very bad, which is why this snuck in. I’ll see if I can address that as well.

This `cdotc` usage will be gone by the end of this week when #18391 is merged. It doesn’t solve the issue, but it will eliminate its impact radius.

If the linux + netlib jobs pass the import tests and run the test suite, then the feedstock changes in conda-forge/scipy-feedstock#231 are at fault (somehow).

Reporting back here: the distutils-based builds against netlib on aarch64/ppc64le were successful in https://github.com/conda-forge/scipy-feedstock/pull/224, so it appears to be some consequence of switching to meson.

It’s late for me so I won’t be able to report the result right away, but I’ve restarted a test against all blas variants in https://github.com/conda-forge/scipy-feedstock/pull/224 when built with distutils rather than meson (if nothing else, then to rule out that this might have something to do with it).

If the linux + netlib jobs pass the import tests and run the test suite, then the feedstock changes in https://github.com/conda-forge/scipy-feedstock/pull/231 are at fault (somehow).

Yes, for Linux and macOS, the conda-forge 1.10.0 binaries were built with setup.py, the latest 1.10.1 ones with Meson.

It looks like 3 weeks ago our CI was running with 1.10.1 binaries from conda-forge. So maybe this change, which was made 2 weeks ago, has some connection: https://github.com/conda-forge/scipy-feedstock/pull/231