OpenBLAS: Q: Problems with replicating binary package performance

Hi, I ran into something strange while trying to compile a custom opencv+dldt build with OpenBLAS GEMM.

I used the precompiled libopenblas-0.2.20 that comes with Ubuntu 18.04 LTS and got very good results. Then I tried to reproduce this result with the newest OpenBLAS version (0.3.9) and even with the original 0.2.20, both built from source.

I tried to compile it as suggested in the User Guide (with a single make), with and without gfortran. I also tried to compile it with cmake using different flags:

First version: cmake -D BUILD_SHARED_LIBS=ON

Last version:

cmake -D BUILD_RELAPACK=ON \
            -D CMAKE_BUILD_TYPE=Release \
            -D NO_AFFINITY=ON \
            -D BUILD_SHARED_LIBS=ON \
            -D CMAKE_INSTALL_PREFIX=./OUT/ \
            -D CMAKE_INSTALL_FULL_INCLUDEDIR=./OUT/include/ \
            -D USE_THREAD=ON ..

But the resulting libopenblas.so.0 was always about 1/2 the size of the original binary lib, and its speed was 2 times slower (at best):

OpenBLAS                         Inference time (ms)   Lib size (MB)
0.2.20 (src)                     520                   13.7
0.3.9 (src)                      160                   16.4
0.3.9 (src, RELAPACK)            140                   16.1
0.3.9 (src, without gfortran)    160                   3.9
0.2.20 (binary)                  60                    32
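
To put the "2 times slower (at best)" figure in context, the timings in the table can be expressed as multiples of the binary package's 60 ms. The snippet below is only a small sketch of that arithmetic. The `results` variable and the `slowdown_vs` helper are names made up for illustration; the numbers are copied from the table.

```shell
# Inference times (ms) copied from the table above, one "label|ms" per line.
results='0.2.20 (src)|520
0.3.9 (src)|160
0.3.9 (src, RELAPACK)|140
0.3.9 (src, without gfortran)|160
0.2.20 (binary)|60'

# Print each build's time as a multiple of the baseline time given in ms.
# (slowdown_vs is a hypothetical helper, not part of OpenBLAS.)
slowdown_vs() {
    awk -F'|' -v base="$1" '{ printf "%-30s %.1fx\n", $1, $2 / base }'
}

printf '%s\n' "$results" | slowdown_vs 60
```

With the binary package as the baseline, the fastest source build (0.3.9 with RELAPACK) comes out at roughly 2.3x and the 0.2.20 source build at roughly 8.7x.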

I ran all experiments on an Ubuntu 18.04 LTS multipass instance with an Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz (4 physical cores with Hyper-Threading, but I gave the multipass instance only 6 threads, which looks like 3 cores + HT), gcc-7.5.0, gfortran-7.5.0.
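
When comparing libraries inside a VM like this, it can help to pin the OpenBLAS thread count to the CPUs the instance actually sees, so every build is measured under the same conditions. The following is a minimal sketch, not a diagnosis of the slowdown. `OPENBLAS_NUM_THREADS` is the environment variable OpenBLAS reads for its thread count; `visible_cpus` is a hypothetical helper name.

```shell
# Count the CPUs the OS exposes to this instance (6 in the setup described above).
# visible_cpus is a hypothetical helper; it falls back to nproc if getconf fails.
visible_cpus() {
    getconf _NPROCESSORS_ONLN 2>/dev/null || nproc
}

# OpenBLAS reads OPENBLAS_NUM_THREADS when the library is loaded, so setting it
# before each run keeps the thread count identical across the builds compared.
OPENBLAS_NUM_THREADS="$(visible_cpus)"
export OPENBLAS_NUM_THREADS
echo "OpenBLAS threads: $OPENBLAS_NUM_THREADS"
```

Run the benchmark from the same shell after this, once per library, so that only the library changes between measurements.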

What am I doing wrong?
What are the optimal parameters for maximum performance on an Intel Haswell x86_64 CPU? What cmake flags were set for the lib from the Ubuntu repository?

Please help

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 15 (6 by maintainers)

Most upvoted comments

PS: -D NOFORTRAN=ON also shrinks size of 0.3.9 to 4.14MB without inference speed loss.

Understandable, as your code only uses the BLAS part of OpenBLAS, which is written in C and assembler. Fortran is required only for building the much larger set of linear algebra functions named LAPACK, which sits on top of BLAS (and which OpenBLAS largely imports from the reference implementation that has its roots in the late 1980s).
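
One way to see the BLAS/LAPACK split described above on a concrete build is to look at the symbols the shared library exports. The sketch below assumes binutils `nm` is available and takes `dgemm_` as a representative BLAS routine and `dgesvd_` as a representative LAPACK routine. The `blas_lapack_report` helper, the `OPENBLAS_LIB` variable and the default path under `./OUT/` are hypothetical and need adjusting to the actual install location.

```shell
# Read `nm` output on stdin and report which interfaces are present.
# dgemm_ is a BLAS routine; dgesvd_ is a LAPACK routine (Fortran symbol names).
# blas_lapack_report is a hypothetical helper, not an OpenBLAS tool.
blas_lapack_report() {
    awk '
        $NF == "dgemm_"  { blas = 1 }
        $NF == "dgesvd_" { lapack = 1 }
        END {
            print "BLAS:   " (blas ? "yes" : "no")
            print "LAPACK: " (lapack ? "yes" : "no")
        }'
}

# Assumed path; point OPENBLAS_LIB at the library under test.
lib="${OPENBLAS_LIB:-./OUT/lib/libopenblas.so.0}"
if [ -f "$lib" ]; then
    # List only the dynamic symbols the library itself defines.
    nm -D --defined-only "$lib" | blas_lapack_report
fi
```

A build made without Fortran would be expected to report BLAS but not LAPACK, which matches the much smaller library size.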