OpenBLAS: openblas build fails using Cray compiler cce

openblas@0.3.20 build fails using Cray compiler on OLCF Crusher.

I suspect this issue is not specific to Crusher and affects any system where Cray compilers are used to build openblas.

Spack compiler definition for cce:

compilers:
  - compiler:
      spec: cce@14.0.0
      paths:
        cc: /opt/cray/pe/craype/2.7.13/bin/cc
        cxx: /opt/cray/pe/craype/2.7.13/bin/CC
        f77: /opt/cray/pe/craype/2.7.13/bin/ftn
        fc: /opt/cray/pe/craype/2.7.13/bin/ftn
      flags: {}
      operating_system: sles15
      target: any
      modules:
      - PrgEnv-cray/8.3.3
      - cce/14.0.0
      - craype-x86-trento
      - libfabric
      - cray-pmi/6.1.2

Build error:

$> spack install openblas %cce@14
...
==> Installing openblas-0.3.20-wcx5q4myxg5skj63wzuf24bn2rhyfac4
==> No binary for openblas-0.3.20-wcx5q4myxg5skj63wzuf24bn2rhyfac4 found: installing from source
==> Fetching https://mirror.spack.io/_source-cache/archive/84/8495c9affc536253648e942908e88e097f2ec7753ede55aca52e5dead3029e3c.tar.gz
==> No patches needed for openblas
==> openblas: Executing phase: 'edit'
==> openblas: Executing phase: 'build'
==> Error: ProcessError: Command exited with status 2:
    'make' '-j16' 'CC=/gpfs/alpine/csc439/world-shared/E4S/ParaTools/22.05/PrgEnv-cray/spack/lib/spack/env/cce/cc' 'FC=/gpfs/alpine/csc439/world-shared/E4S/ParaTools/22.05/PrgEnv-cray/spack/lib/spack/env/cce/ftn' 'MAKE_NB_JOBS=0' 'ARCH=x86_64' 'TARGET=ZEN' 'USE_LOCKING=1' 'USE_OPENMP=1' 'USE_THREAD=1' 'RANLIB=ranlib' 'libs' 'netlib' 'shared'

18 errors found in build log:
     2879    ftn-2103 ftn: WARNING in command line
     2880      The -W all option is not supported or invalid and will be ignored.
     2881    ftn-2307 ftn: ERROR in command line
     2882      The "-m" option must be followed by 0, 1, 2, 3 or 4.
     2883    ftn-2307 ftn: ERROR in command line
     2884      The "-m" option must be followed by 0, 1, 2, 3 or 4.
  >> 2885    make[2]: *** [<builtin>: spotrf2.o] Error 1
     2886    make[2]: *** Waiting for unfinished jobs....
  >> 2887    make[2]: *** [<builtin>: sgecon.o] Error 1
  >> 2888    make[2]: *** [<builtin>: sgebak.o] Error 1
  >> 2889    make[2]: *** [<builtin>: sgbsvx.o] Error 1
  >> 2890    make[2]: *** [<builtin>: sgetrf2.o] Error 1
  >> 2891    make[2]: *** [<builtin>: sgbbrd.o] Error 1
  >> 2892    make[2]: *** [<builtin>: sgebal.o] Error 1
  >> 2893    make[2]: *** [<builtin>: sgbcon.o] Error 1
  >> 2894    make[2]: *** [<builtin>: sgbtrs.o] Error 1
  >> 2895    make[2]: *** [<builtin>: sgbrfs.o] Error 1
  >> 2896    make[2]: *** [<builtin>: sgebrd.o] Error 1
  >> 2897    make[2]: *** [<builtin>: sgebd2.o] Error 1
  >> 2898    make[2]: *** [<builtin>: sgbtrf.o] Error 1
  >> 2899    make[2]: *** [<builtin>: sgbequ.o] Error 1
  >> 2900    make[2]: *** [<builtin>: sgbtf2.o] Error 1
  >> 2901    make[2]: *** [<builtin>: sgbsv.o] Error 1
     2902    make[2]: Leaving directory '/tmp/eugenewalker/spack-stage/spack-stage-openblas-0.3.20-wcx5q4myxg5skj63wzuf24bn2rhyfac4/spack-src/lapack-netlib/SRC'
  >> 2903    make[1]: *** [Makefile:27: lapacklib] Error 2
     2904    make[1]: Leaving directory '/tmp/eugenewalker/spack-stage/spack-stage-openblas-0.3.20-wcx5q4myxg5skj63wzuf24bn2rhyfac4/spack-src/lapack-netlib'
  >> 2905    make: *** [Makefile:250: netlib] Error 2

FYI https://github.com/spack/spack/issues/30388

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 79

Most upvoted comments

@martin-frbg, @becker33, the problem appears to be that crayftn pattern-matches code to libsci routines (even without the libsci module loaded). To turn this feature off, add -hnopattern to the Fortran compiler flags. With @martin-frbg's changes to the build system (mentioned above), the following seems to work:

'make' '-j16' \
FFLAGS='-hnopattern' \
'CC=/autofs/nccs-svm1_proj/ven114/lukebr/spack/lib/spack/env/cce/cc' \
'FC=/autofs/nccs-svm1_proj/ven114/lukebr/spack/lib/spack/env/cce/ftn' \
'MAKE_NB_JOBS=0' 'ARCH=x86_64' 'TARGET=ZEN' 'USE_LOCKING=1' \
'USE_OPENMP=0' 'USE_THREAD=0' 'RANLIB=ranlib' 'libs' 'netlib' 'shared'
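
If the build is driven through Spack rather than by hand, one possible way to pass the same flag is through the spec, since Spack accepts compiler flags such as fflags there. This is only a sketch: the variable name spec is made up for illustration, it has not been verified here that the flag reaches every ftn invocation in the OpenBLAS build, and it does not replace the build-system changes mentioned above.

```shell
#!/bin/sh
# Sketch only: "spec" is a hypothetical variable name.
# fflags=... is Spack's spec syntax for extra Fortran compiler flags;
# -hnopattern is the crayftn option suggested in this thread.
spec='openblas %cce@14 fflags=-hnopattern'

if command -v spack >/dev/null 2>&1; then
    # Preview how Spack concretizes the spec before installing anything.
    # $spec is left unquoted on purpose so it splits into separate words.
    spack spec $spec
else
    echo "spack not found; would run: spack spec $spec"
fi
```

If the concretized spec shows the flag as expected, the same spec can be given to spack install.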

Sorry for the messy editing. Makefile.system has now been replaced again. For backporting to 0.3.20, it should be safe to use the f_check.pl as a direct replacement for the 0.3.20 f_check, and 0.3.21 will hopefully be out in a few days. Thank you for testing.

Now (maybe…)

New Makefile.system with -hnopattern integrated: Makefile.system.txt

Thank you very much. So calling the wrappers actually works as long as the -hnopattern option is present, and there is no need to call the craycc/crayftn executables directly? (I’m a bit worried that it could be linking against your libsci again.)

Can you try with the replacement KERNEL.ZEN file, please?

This may be as simple as adding “ftn” to the list of executable names at the beginning of f_check to avoid the early fallback to assuming gfortran. However, I need to see the assembler file produced by ftn -S ftest.f to identify the strings the Cray compiler puts in its output. (There may be more obstacles once basic recognition works; for instance, we need to figure out the interfacing convention for mixed C/Fortran code.)
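
The inspection step described above could be carried out roughly as follows. This is a sketch, not the procedure f_check itself uses: the contents of ftest.f here are a made-up minimal program, the output name ftest.s is assumed from the usual -S convention, and the search pattern "cray" is only a guess at what an identifying string might contain.

```shell
#!/bin/sh
# Write a minimal fixed-form Fortran program (hypothetical test input).
cat > ftest.f <<'EOF'
      program ftest
      end
EOF

if command -v ftn >/dev/null 2>&1; then
    # -S asks the compiler to stop after generating assembler output.
    # Assumption: the result is written to ftest.s and mentions the vendor.
    ftn -S ftest.f && grep -i 'cray' ftest.s | head -n 20
else
    echo "ftn not found; run this where the Cray programming environment is loaded"
fi
```

If the grep prints nothing, attaching the whole ftest.s file to the issue is probably the more useful next step.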

I’ll see what I can come up with tomorrow (past 11pm here now).