scipy: Memory Allocation Fault in LU factorization

This issue was encountered here in Scikit. In short, when trying to perform linear dimensionality reduction for data analysis purposes, the script fails abruptly with a memory allocation error.

After some debugging, I managed to pin down the possible source of the problem to the LU factorization on line 258 in utils/extmath.py. The fault occurs inside the decomposition in linalg.lu (specifically in line 185 of scipy/linalg/decomp_lu.py): the execution flow passes through the f2py wrapper of dlu_c and fails somewhere inside dgetrf (line 27).
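One way to check whether the LAPACK routine itself works on a given setup is to call the dgetrf wrapper directly. This is only a minimal sketch: it assumes scipy.linalg.lapack.dgetrf reaches the same LAPACK routine as linalg.lu, and the small random matrix is an illustrative input, not the failing case.

```python
import numpy as np
from scipy.linalg import lapack

# Illustrative input only: a small 6 x 3 matrix, far below the failing size.
rng = np.random.default_rng(0)
a = rng.standard_normal((6, 3))

# dgetrf returns the packed LU factors, the pivot indices, and a status code
# (info == 0 means the factorization completed without error).
lu, piv, info = lapack.dgetrf(a)

print(lu.shape, piv.shape, info)
```

If this small call succeeds, that is consistent with the fault depending on the matrix size rather than on the wrapper as such, though it does not prove it.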

Below is a simple reproducing snippet (on OS X 10.10.5), although LU-decomposing a matrix such as this makes little sense:

import numpy as np

from scipy import linalg

# changing 46336 to 46335 avoids the fault
obj = np.ones((46336, 110), dtype=np.float64)
Q, _ = linalg.lu(obj, permute_l=True)

In python 3.6.0 (and in 2.7.13) it fails with

python(723,0x7fff77404300) malloc: *** mach_vm_map(size=18446744065123717120) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug

The value 18446744065123717120 is FFFF FFFE 003E 9000 in hex, which looks like a sign extension of an unsigned int.
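The arithmetic behind that observation can be checked in a few lines. This sketch only reinterprets the number reported by malloc; it makes no claim about where in Accelerate the value is computed.

```python
value = 18446744065123717120  # size reported by mach_vm_map in the error above

# Hexadecimal form of the reported size.
as_hex = hex(value)

# Reinterpret the same 64 bits as a signed (two's complement) integer.
signed = value - (1 << 64) if value >= (1 << 63) else value

print(as_hex)   # 0xfffffffe003e9000
print(signed)   # -8585834496
```

Read as a signed 64-bit integer, the requested size is negative, which is what one would expect if a size computation overflowed before being passed to malloc.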

I used the following setup: python 3.6.0 (default, Dec 24 2016, 08:02:28) (in virtualenv), default numpy (‘1.12.0’) from pip. I tested both the latest build from git and the pip version (‘0.18.1’) of scipy. System information:

System Version: OS X 10.10.5 (14F2109)
Kernel Version: Darwin 14.5.0

No manual linkage to specific linear algebra libraries was done, since according to the reference numpy/scipy automatically links with Apple’s Accelerate/vecLib linear algebra libraries.
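To confirm which BLAS/LAPACK libraries a given installation was actually built against, the build configuration can be printed. A minimal sketch; the exact output format differs between numpy/scipy versions, so no particular output is assumed here.

```python
import numpy as np
import scipy

# Print the BLAS/LAPACK build information recorded when each package was built.
np.show_config()
scipy.show_config()
```

On a setup like the one described here, the output would be expected to mention Accelerate/vecLib, but that depends on how the wheels were built.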

Judging from the crash report from Python 2.7.13 (crash_report.txt), something odd is going on inside Apple’s Accelerate.

Python 2.7.13 (default, Dec 18 2016, 07:03:34) 
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin

cc: @lesteve

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 1
  • Comments: 110 (89 by maintainers)

Most upvoted comments

Given the divergence described with Accelerate, a simpler option might be for the Apple Developer Ecosystem Engineering team to provide patches to optimize OpenBLAS. This would parallel their support of open source projects like Embree and Clang/LLVM. Support could be added piecemeal and tested against the existing functional but not optimized code. I do think direct help from Apple would help, though some have begun to reverse engineer the AMX instructions.

The only reason I re-ignited this here is that the specific problem raised in this issue seems to be resolved with the M1 (or perhaps the latest versions of Accelerate). Naively, I thought this was one of the few blockers to using the Accelerate framework. There are clearly a lot more issues that I was not aware of. This is well outside my expertise. I do appreciate the insightful and rapid feedback from SciPy developers. This suggests a vibrant community.

My sense is that this question dies here unless someone closely involved with either SciPy or Apple takes the lead. If anyone fits that description, I encourage them to start a new issue.

By the way, should we move this to a brand new issue instead of hiding inside an old linalg issue? On one hand I like that it is hidden here, since it gets salty very quickly when a brand is discussed, but it might also need more eyes on it.

Indeed, the bug seems to be fixed in High Sierra 😃

I just upgraded to macOS High Sierra (10.13) and the bug seems to have been fixed by Apple. All examples above (both Python and C++) work! 🎉

Sorry, I’m boarding a plane in an hour and will probably not have time for more than drive-by commenting in the next few weeks. But I think if you can work things out as far as possible along these lines, then some of the people on gh-6051 will be happy to fill in the remaining blanks.

@rgommers @cbrnr Let me try.

I’m hopeful that once the discussion in #6051 is finalized, we will be back on track with the freedom to choose a BLAS or LAPACK flavor. But it seems that we are stuck with Accelerate until v1.0.