OpenBLAS: GEMM numerical errors with complex single-precision (cgemm) on Apple M2

On Apple M2 it seems there is a numerical error when using CGEMM, so just with single precision complex type. Actually, we found problems with CHER2K as well, but our speculation is that it probably shares code with GEMM.

Hereafter, a small reproducer and an excerpt of internal OpenBLAS tests where it is reported CGEMM failure.

Reproducer

Here a reproducer of the GEMM problem.

#include <complex>
#include <iostream>

extern "C" {
  void zgemm_(const char*, const char*,
      int*, int*, int*,
      std::complex<double>*,
      std::complex<double>*, int*,
      std::complex<double>*, int*,
      std::complex<double>*,
      std::complex<double>*, int*);

  void cgemm_(const char*, const char*,
      int*, int*, int*,
      std::complex<float>*,
      std::complex<float>*, int*,
      std::complex<float>*, int*,
      std::complex<float>*,
      std::complex<float>*, int*);
}

template <class T>
using gemm_func_t = void (*)(const char*, const char*,
    int*, int*, int*,
    std::complex<T>*,
    std::complex<T>*, int*,
    std::complex<T>*, int*,
    std::complex<T>*,
    std::complex<T>*, int*);

template <class T>
void test_gemm(gemm_func_t<T> gemm) {
  std::complex<T> a(1, 1);
  std::complex<T> b(1, 1);
  std::complex<T> c(1, 1);

  int m = 1, n = 1, k = 1;
  int ld = 1;

  std::complex<T> alpha{0.0, 1.0}, beta{0.0, 0.0};

  gemm("N", "N",
      &m, &n, &k,
      &alpha,
      &a, &ld,
      &b, &ld,
      &beta,
      &c, &ld);

  std::cout << c << "\n";
}

int main() {
  test_gemm<float>(&cgemm_);
  test_gemm<double>(&zgemm_);
}

Whose output on an Apple M2 is

$ ./example
(0,0)
(-2,0)

when it is supposed to give exactly the same result, i.e. (-2, 0).

OpenBLAS test

Here an excerpt of the output of OpenBLAS tests, where a failure related to CGEMM is reported (it was the only failure).

 ******* FATAL ERROR - COMPUTED RESULT IS LESS THAN HALF ACCURATE *******
                       EXPECTED RESULT                    COMPUTED RESULT
       1  (  -0.716881    ,  -0.369739    )  (  -0.716881    ,  -0.369739    )
       2  (   0.418750    ,    1.08548    )  (   0.418750    ,    1.08548    )
       3  (   0.211250    ,   -1.68483    )  (    1.21094    ,  -0.470845    )
 ******* CGEMM  FAILED ON CALL NUMBER:
   9674: CGEMM ('T','N',  3,  1, 31,( 0.7,-0.9), A, 32, B, 32,( 1.0, 0.0), C,  4).

Thanks to @rasolca for helping investigating this.

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 19

Most upvoted comments

Thanks for the feedback, I have merged #4003 now. (BTW the SciPy folks are also seeing some issues with their testsuite and unpatched 0.3.23 - not clear yet if it is the exact same problem or perhaps tiny FMA-related inaccuracies on the new hardware)