scipy: BUG: Seg. fault in scipy.sparse.linalg tests in PROPACK

Describe your issue.

The test scipy.sparse.linalg.tests.test_propack::test_svdp with parameters which='LM', irl=True, precision='complex8' and ctor=np.array triggers a seg. fault.

E.g. (after installing the master branch with pip install .):

$ python runtests.py -n -m full -t scipy.sparse.linalg.tests.test_propack::test_svdp -v
================================================================ test session starts =================================================================
platform linux -- Python 3.9.5, pytest-6.2.4, py-1.10.0, pluggy-0.13.1 -- /home/warren/mc39scipy/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/home/warren/repos/git/forks/scipy/build/test/.hypothesis/examples')
rootdir: /home/warren/repos/git/forks/scipy, configfile: pytest.ini
plugins: hypothesis-6.14.1
collected 4 items                                                                                                                                    

../../::test_svdp[LM-True-single-array] PASSED                                                                                                 [ 25%]
../../::test_svdp[LM-True-double-array] PASSED                                                                                                 [ 50%]
../../::test_svdp[LM-True-complex8-array] Fatal Python error: Segmentation fault

Current thread 0x00007facaaea9380 (most recent call first):
  File "/home/warren/mc39scipy/lib/python3.9/site-packages/scipy/sparse/linalg/_svdp.py", line 302 in _svdp
  File "/home/warren/mc39scipy/lib/python3.9/site-packages/scipy/sparse/linalg/tests/test_propack.py", line 73 in check_svdp
  File "/home/warren/mc39scipy/lib/python3.9/site-packages/scipy/sparse/linalg/tests/test_propack.py", line 112 in test_svdp
<snip>
  File "/home/warren/mc39scipy/lib/python3.9/site-packages/scipy/_lib/_testutils.py", line 67 in __call__
  File "/home/warren/repos/git/forks/scipy/runtests.py", line 317 in main
  File "/home/warren/repos/git/forks/scipy/runtests.py", line 566 in <module>
Segmentation fault (core dumped)

This short script also triggers the seg. fault:

import numpy as np
from scipy.sparse import linalg


a = np.array([[0, 1j, 0], [-1j, 0, 0], [0, 0, 2]], dtype=np.complex64)
result = linalg._svdp._svdp(a, 1)

This is on a Linux machine running PopOS 21.04

$ uname -a
Linux pop-os 5.13.0-7620-generic #20~1634827117~21.04~874b071-Ubuntu SMP Fri Oct 29 15:06:55 UTC  x86_64 x86_64 x86_64 GNU/Linux

Reproducing Code Example

See above.

Error message

See above.

SciPy/NumPy/Python version information

1.8.0.dev0+2099.3f05efb 1.21.2 sys.version_info(major=3, minor=9, micro=5, releaselevel=‘final’, serial=0)

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 74 (73 by maintainers)

Most upvoted comments

All the build system related issues and the low-hanging fruit in PROPACK itself should be fixed now. Here are the structural issues that seem to be left:

  1. Incorrect use of cdotc/zdotc wrappers
  2. Use of ?aprod as functions rather than subroutines
  3. Incorrect use of __APPLE__

In addition there are some code hygiene things that would be useful to clean up to make the code more readable:

  1. Completely duplicate files in the complex8 and complex16 submodules. E.g., dblasext.F is the same in double/ (where it belongs) and complex16/, modulo a single declaration of ZDOTC which really belongs in zblasext.F
  2. Half the code is OpenMP related, and should probably be simply deleted

Here are what it looks like needs doing for 1-3:

Incorrect use of cdotc/zdotc wrappers

The ABI wrappers like wcdotc that we have in scipy/_build_utils/src/wrap_g77_abi* should be used the way is done in ARPACK and other libraries, rather than being defined locally in PROPACK files. Which means modifying PROPACK sources to always use wcdotc rather than cdotc, instead of redefining wrappers like this (from cblasext.F):

    #ifdef SCIPY_USE_G77_CDOTC_WRAP
          complex function cdotc(n, cx, incx, cy, incy)
          complex cx(*), cy(*), c
          integer n, incx, incy
          call WCDOTC(n, cx, incx, cy, incy)
          cdotc = c
          return
          end

    #endif

Also, it seems that the return value of WCDOTC is never captured and PROPACK’s wrapper will simply return whatever garbage is in c. And it looks like WCDOTC() is being used as a subroutine instead of a function, like it’s declared.

Use of ?aprod as functions rather than subroutines

This needs code review from someone who understands the f2py signature files. There are warnings when ?aprod is being used. Each of the ?propack.pyf files are declaring ?aprod as functions, however, the intention seems to be they’re subroutines - not actually setting a return value, and ?lanvd, ?lanbpro, etc. are using them as subroutines.

Incorrect use of __APPLE__

The use of # ifndef __APPLE__ seems to be based on an implicit assumption about either Intel or PowerPC architecture about size of types, or a particular BLAS library. This kind of thing must be wrong:

      complex*16 function pzdotu(n, x , incx, y, incy)
      implicit none
      integer n, incx, incy
      complex*16 x(*),y(*)
#ifndef __APPLE__
      complex*16 zdotu
      external zdotu
      
      pzdotu = zdotu(n, x , incx, y, incy)
#else
      integer i
      pzdotu = (0.0,0.0)
      if (incx.eq.1 .and. incy.eq.1) then
         do i=1,n
            pzdotu = pzdotu + x(i)*y(i)
         enddo
      else
         do i=1,n
            pzdotu = pzdotu + x(1+(i-1)*incx)*y(1+(i-1)*incy)
         enddo
      endif
#endif

I honestly don’t know what is going on there, but the use of __APPLE__ must be removed, this code makes no sense whatsoever.

@mckib2 @rgommers @tylerjereddy

Amazing news, PROPACK tests for 1.10.0 pass on windows with all blas flavours! 🤩

Ah, I thought that was this issue

@h-vetinari would you be able to submit a PR to mark those as xfail in main, and mark it as backport-candidate?

Normally this is what I would have done (c.f. #16512, which had a somewhat narrower scope of complex128), but I tend to think the “surface” of segfaults is still too large to just mark these as expected failures and switch on propack support unconditionally (which is the only point where this matters AFAICT, as the test suite without set SCIPY_USE_PROPACK=1 is green as of rc2).

I started another run in https://github.com/conda-forge/scipy-feedstock/pull/200, and windows fails for all BLAS flavours (not just MKL), while unix passes for all.

For reference, conda list should look something like:

[...]
python                    3.9.9        hcf16a7b_0_cpython    conda-forge
python_abi                3.9                      2_cp39    conda-forge
scipy                     1.8.0rc2         py39ha366c82_0    file:///C:/Users/[xxx]/Downloads/scipy-artefact
setuptools                60.2.0           py39hcbf5309_0    conda-forge
[...]

With the latest artefacts, I can reproduce the segfault (and the error in TestFBLAS2Simple.test_syr_her that you encountered, which was already there in 1.7.3).

Yep – this was the issue. conda list reports scipy 1.7.3. I’ll try running with artifact links you posted above

EDIT: success! I have reproduced the crash, hopefully have some time this evening to start looking into it, thanks @h-vetinari!

Fingers crossed, I think I have this resolved here by adding a couple wrappers to the PROPACK module itself: gh-15249