OpenBLAS: BUG: Regression in DGESDD / DLASCL on macOS Vortex M1

On 8c20ca345aad43c2f74a72b356afdbc1ec368e31 using OpenBLAS to build and compile SciPy I get problems when I try to take use DGESDD (or DGESVD) to take the SVD of a matrix of ones of shape (41, 50):

OpenBLAS larsoner$ make clean; git reset --hard; git clean -xdf; MACOSX_DEPLOYMENT_TARGET=11.0 make && make install
scipy larsoner$ git clean -xdf && ln -s ../site.cfg && python setup.py develop && python -c "import numpy as np, scipy.linalg; scipy.linalg.svd(np.ones((41, 50)))"
 ** On entry to DLASCL parameter number  4 had an illegal value
 ** On entry to DLASCL parameter number  4 had an illegal value
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/larsoner/python/scipy/scipy/linalg/_decomp_svd.py", line 131, in svd
    raise LinAlgError("SVD did not converge")
numpy.linalg.LinAlgError: SVD did not converge

On the previous commit 8e4c209002d1a648094a1ac5ca02efbae69d253f, things seem to be okay (SVD decomp matches that on x86_64 Linux). Same thing with being on latest develop / 7060ca50020e990f2e4cc7c3f340fcf1a887af7d and doing git revert 8c20ca3 to revert just one of the two commits (8c20ca3) from #3399 things are okay.

@martin-frbg looks like https://github.com/xianyi/OpenBLAS/pull/3399 this was your PR, can you replicate?

cross-ref https://github.com/numpy/numpy/issues/21756

About this issue

Original URL
State: closed
Created 2 years ago
Comments: 18 (9 by maintainers)

Most upvoted comments

I commented this over in https://github.com/xianyi/OpenBLAS/pull/3657#issuecomment-1159748441 , but just to have all the testing in one place, this KERNEL.VORTEX also appears to be okay, as suggested by @martin-frbg :

include $(KERNELDIR)/KERNEL.NEOVERSEN1
DNRM2KERNEL = ../arm64/nrm2.S
SNRM2KERNEL = ../arm64/nrm2.S

larsoner on Jun 19, 2022

but I am still horribly out of my depth with ARM assembly)

Okay so maybe until someone can fix the thunderx2t99 code to work on M1 / VORTEX we should just switch. +1 for putting this change into #3657, especially if your test case in that PR does turn things red on CIs so we can see it switch to green!

larsoner on Jun 19, 2022

So @martin-frbg for now do you suggest that we put in this little workaround? It looks like you’re already working on something in #3657 so happy to let you move forward however you think is best…

larsoner on Jun 19, 2022

Oh sh… sorry for that. Reproducible now on M1/Linux. Slightly improved solution is to set the DNRM2 (and SNRM2) kernel to “nrm2.S” (as used by most arm64 targets - and corresponding znrm2.S for CNRM2/ZNRM2 ). ThunderX2 kernel must be taking a shortcut or missing an initilalization somewhere that does not show up in the regular BLAS/LAPACK tests. Maybe even something introduced by its most recent fix to handle Inf in input.

martin-frbg on Jun 19, 2022

there are other call chains than direct to DLASCL , so it would be worth attaching debugger breakpoint to XERBLA function and seeing which call chain is anomalous generating NaN-s. That would reduce search space below.

If you can point me to a tutorial or step-by-step for how to do this I’m happy to. But I’ll start with trying to follow @martin-frbg 's suggestion since I think it covers what you suggest as well …

change kernel/arm64/KERNEL.VORTEX to include KERNEL.ARMV8 instead…

No need to copy entire KERNEL fliles, just adding…

Okay so digging into kernels/arm64/VORTEX with what is in develop, and just to be complete – using the following:

FAILS:

include $(KERNELDIR)/KERNEL.NEOVERSEN1

SUCCEEDS (n.b. the same as git revert 8c20ca3):

include $(KERNELDIR)/KERNEL.ARMV8

SUCCEEDS:

include $(KERNELDIR)/KERNEL.NEOVERSEN1
SNRM2KERNEL = ../arm/nrm2.c
DNRM2KERNEL = ../arm/nrm2.c

FAILS (makes sense because this is a double computation)

include $(KERNELDIR)/KERNEL.NEOVERSEN1
SNRM2KERNEL = ../arm/nrm2.c

SUCCEEDS:

include $(KERNELDIR)/KERNEL.NEOVERSEN1
DNRM2KERNEL = ../arm/nrm2.c

So it seems to be the dznrm2_thunderx2t99.c (included by KERNEL.NEOVERSEN1) that is problematic!

I’m happy to debug further if someone can give me some instructions – I’m barely a C programmer nowadays so I’m hopeless in assembly 😦

If you see picture here at netlib there are other call chains than direct to DLASCL

So another way to go here is to make it clear what code path NumPy is using that hits the bug. But before that, @martin-frbg I took another look at the pure-fortran example and noticed what I think is an error – the second call was to SGESDD rather than DGESDD. When I change it to use DGESDD, I can replicate the error:

$ gfortran -ffree-form gesdd.f /opt/OpenBLAS/lib/libopenblas.a -lpthread -o gesdd
$ ./gesdd 
 ** On entry to DLASCL parameter number  4 had an illegal value
 ** On entry to DLASCL parameter number  4 had an illegal value
Note: The following floating-point exceptions are signalling: IEEE_INVALID_FLAG IEEE_OVERFLOW_FLAG IEEE_UNDERFLOW_FLAG
STOP Error while computing the SVD!

@martin-frbg maybe you can replicate locally now, too? Here is the modified file (but the diff is just the one line changed):

gesdd.f.txt

larsoner on Jun 19, 2022