OpenBLAS: BUG: Regression in DGESDD / DLASCL on macOS Vortex M1
On 8c20ca345aad43c2f74a72b356afdbc1ec368e31 using OpenBLAS to build and compile SciPy I get problems when I try to take use DGESDD (or DGESVD) to take the SVD of a matrix of ones of shape (41, 50):
OpenBLAS larsoner$ make clean; git reset --hard; git clean -xdf; MACOSX_DEPLOYMENT_TARGET=11.0 make && make install
scipy larsoner$ git clean -xdf && ln -s ../site.cfg && python setup.py develop && python -c "import numpy as np, scipy.linalg; scipy.linalg.svd(np.ones((41, 50)))"
** On entry to DLASCL parameter number 4 had an illegal value
** On entry to DLASCL parameter number 4 had an illegal value
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Users/larsoner/python/scipy/scipy/linalg/_decomp_svd.py", line 131, in svd
raise LinAlgError("SVD did not converge")
numpy.linalg.LinAlgError: SVD did not converge
On the previous commit 8e4c209002d1a648094a1ac5ca02efbae69d253f, things seem to be okay (SVD decomp matches that on x86_64 Linux). Same thing with being on latest develop / 7060ca50020e990f2e4cc7c3f340fcf1a887af7d and doing git revert 8c20ca3 to revert just one of the two commits (8c20ca3) from #3399 things are okay.
@martin-frbg looks like https://github.com/xianyi/OpenBLAS/pull/3399 this was your PR, can you replicate?
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 18 (9 by maintainers)
I commented this over in https://github.com/xianyi/OpenBLAS/pull/3657#issuecomment-1159748441 , but just to have all the testing in one place, this KERNEL.VORTEX also appears to be okay, as suggested by @martin-frbg :
Okay so maybe until someone can fix the thunderx2t99 code to work on M1 / VORTEX we should just switch. +1 for putting this change into #3657, especially if your test case in that PR does turn things red on CIs so we can see it switch to green!
So @martin-frbg for now do you suggest that we put in this little workaround? It looks like you’re already working on something in #3657 so happy to let you move forward however you think is best…
Oh sh… sorry for that. Reproducible now on M1/Linux. Slightly improved solution is to set the DNRM2 (and SNRM2) kernel to “nrm2.S” (as used by most arm64 targets - and corresponding znrm2.S for CNRM2/ZNRM2 ). ThunderX2 kernel must be taking a shortcut or missing an initilalization somewhere that does not show up in the regular BLAS/LAPACK tests. Maybe even something introduced by its most recent fix to handle Inf in input.
If you can point me to a tutorial or step-by-step for how to do this I’m happy to. But I’ll start with trying to follow @martin-frbg 's suggestion since I think it covers what you suggest as well …
Okay so digging into
kernels/arm64/VORTEXwith what is indevelop, and just to be complete – using the following:FAILS:
SUCCEEDS (n.b. the same as
git revert 8c20ca3):SUCCEEDS:
FAILS (makes sense because this is a double computation)
SUCCEEDS:
So it seems to be the
dznrm2_thunderx2t99.c(included by KERNEL.NEOVERSEN1) that is problematic!I’m happy to debug further if someone can give me some instructions – I’m barely a C programmer nowadays so I’m hopeless in assembly 😦
So another way to go here is to make it clear what code path NumPy is using that hits the bug. But before that, @martin-frbg I took another look at the pure-fortran example and noticed what I think is an error – the second call was to SGESDD rather than DGESDD. When I change it to use DGESDD, I can replicate the error:
@martin-frbg maybe you can replicate locally now, too? Here is the modified file (but the diff is just the one line changed):
gesdd.f.txt