matrixmultiply: Test involving many 6x6 matrices fails randomly on Mac OS

As part of a bigger project, I have some test code that repeatedly multiplies matrices of various sizes. One of the tests multiplies 6×6 matrices with one another about 200k times. The test typically takes about 15 s to complete on other platforms, but on Mac OS specifically it panics roughly half of the time, about 2 s into the run, on the following line: https://github.com/bluss/matrixmultiply/blob/9e4a11fba28d97d721433c32d517802aa19507cb/src/gemm.rs#L294 Here’s a link to the backtrace from one specific run where it failed.

I’m using nalgebra as a dependency, but given the OS-specific nature of the bug, I’ve ruled out nalgebra itself as the cause. I haven’t been able to isolate a minimal reproduction, but hopefully the backtrace helps with that.

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 16 (8 by maintainers)

Most upvoted comments

Oh great. If we can reproduce it, I’m sure we can fix it. I don’t have access to any Macs right now. Contributions that add macOS support to CI are welcome, by the way.

I’ll ask on the Rust forum. https://users.rust-lang.org/t/macos-tls-and-alignments/58032
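For anyone who wants to check the TLS alignment question on their own machine, here is a minimal std-only sketch. It is not code from matrixmultiply: the `AlignedBuf` type, the 32-byte alignment, the 256-byte size, and the `tls_misalignment` helper are all assumptions made up for illustration. It only reports how far a thread-local, over-aligned buffer sits from the alignment it asks for.

```rust
use std::thread;

// Hypothetical over-aligned buffer; 32 bytes is an assumed alignment,
// not a value taken from matrixmultiply.
#[repr(align(32))]
struct AlignedBuf([u8; 256]);

thread_local! {
    // One buffer per thread, stored in thread-local storage.
    static BUF: AlignedBuf = AlignedBuf([0; 256]);
}

/// Returns the distance of this thread's buffer from the previous
/// 32-byte boundary. 0 means the requested alignment was honoured.
fn tls_misalignment() -> usize {
    BUF.with(|b| (b as *const AlignedBuf as usize) % 32)
}

fn main() {
    // Check the main thread and a handful of spawned threads, since each
    // thread gets its own copy of the thread-local.
    println!("main thread offset: {}", tls_misalignment());
    let handles: Vec<_> = (0..8).map(|_| thread::spawn(tls_misalignment)).collect();
    for (i, h) in handles.into_iter().enumerate() {
        let offset = h.join().expect("worker thread panicked");
        println!("thread {i} offset: {offset}");
    }
}
```

If any line prints a non-zero offset, the platform did not place the thread-local at the alignment the type requested, which would be consistent with an intermittent, OS-specific failure.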

I already asked on the nalgebra Discord, but the lead dev promptly redirected me here, given that this is an OS-specific bug, and nalgebra itself contains no OS-specific functionality other than that given by the dependencies.

Weirdly, the code in the repo where the bug originated doesn’t seem to fail anymore when the tests are run with the --release flag. At the very least, the last two test runs have been successful. This might be a fluke, so I’ll keep testing.
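One possible explanation, assuming the panicking line is a `debug_assert!`-style check (I have not confirmed this): such checks are compiled out when debug assertions are disabled, which is the default for `--release` builds. The sketch below uses made-up names (`is_aligned`, `kernel_entry`) and an assumed 32-byte requirement purely to show the mechanism.

```rust
/// True if `addr` is a multiple of `align`.
fn is_aligned(addr: usize, align: usize) -> bool {
    addr % align == 0
}

/// Hypothetical stand-in for a routine that expects an aligned buffer.
fn kernel_entry(addr: usize) -> &'static str {
    // Evaluated only when debug assertions are enabled (default for
    // non-release builds); compiled out otherwise.
    debug_assert!(is_aligned(addr, 32), "buffer is not 32-byte aligned");
    if cfg!(debug_assertions) {
        "debug build: alignment was checked"
    } else {
        "release build: alignment check skipped"
    }
}

fn main() {
    // 64 is a multiple of 32, so this passes in either build mode.
    println!("{}", kernel_entry(64));
}
```

If that is what is happening, a passing `--release` run would mean the check is skipped rather than that the underlying misalignment is gone.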