ROCm: `/etc/OpenCL/vendors/amdocl64.icd` doesn't use an absolute path, and the library isn't on the default library search path

Something installs /etc/OpenCL/vendors/amdocl64.icd, but the file isn’t owned by any package, so I suspect a post-install script puts it there, which feels unnecessary.

Also it simply doesn’t work because it is only using the name of the .so file without the path:

$ cat /etc/OpenCL/vendors/amdocl64.icd 
libamdocl64.so
$

So by default ocl-icd-libopencl1 (providing /usr/lib/x86_64-linux-gnu/libOpenCL.so.1.0.0) cannot find the library and simply ignores the ICD file.

Tested with clinfo version 2.2.18.04.06-1 (https://github.com/Oblomov/clinfo) and the generic loader ocl-icd-libopencl1 version 2.2.12-4 (https://forge.imag.fr/projects/ocl-icd/).

$ clinfo | egrep -i 'Parallel|HSA'
$

Putting the absolute path into the ICD definition:

$ cat /etc/OpenCL/vendors/amdocl64.icd
/opt/rocm-3.5.0/opencl/lib/libamdocl64.so
$

solves it:

$ clinfo | egrep -i 'Parallel|HSA'
  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Name                                   AMD Accelerated Parallel Processing
  Driver Version                                  3137.0 (HSA1.1,LC)
$
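
The diagnosis above can be repeated on any system with a small check. This is only a minimal sketch: it assumes each .icd file holds a single line with the library name or path, and the function name check_icd_files is made up for illustration (it is not part of ocl-icd or ROCm).

```shell
# Classify the library entry of every ICD file in a vendors directory.
# The directory argument is optional and defaults to the path discussed above.
check_icd_files() {
  vendors_dir="${1:-/etc/OpenCL/vendors}"
  for icd in "$vendors_dir"/*.icd; do
    [ -e "$icd" ] || continue              # glob matched nothing
    # Read the first line; tolerate a missing trailing newline, skip empty files.
    IFS= read -r lib < "$icd" || [ -n "$lib" ] || continue
    case "$lib" in
      /*)
        # Absolute path: the ICD loader can open it directly if it exists.
        if [ -e "$lib" ]; then
          echo "OK       $icd -> $lib"
        else
          echo "MISSING  $icd -> $lib"
        fi
        ;;
      *)
        # Bare name: only works if the dynamic loader search path covers it.
        echo "BARENAME $icd -> $lib"
        ;;
    esac
  done
}
```

Calling check_icd_files with no argument inspects /etc/OpenCL/vendors. A BARENAME line for amdocl64.icd corresponds to the broken state shown at the top of this report.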

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 8
  • Comments: 15

Most upvoted comments

Still broken in ROCm 3.9.0.

I believe I found the issue.

The postinst script of rocm-opencl3.9.0 uses the wrong paths:

do_ldconfig() {
  if [ -e "/opt/rocm/opencl" ] ; then
    echo /opt/rocm/opencl/lib > /etc/ld.so.conf.d/x86_64-rocm-opencl.conf && ldconfig
  fi
  mkdir -p /etc/OpenCL/vendors && (echo libamdocl64.so > /etc/OpenCL/vendors/amdocl64_30900.icd)
}

INSTALL_PATH=/opt/rocm-3.9.0/opencl
ROCM_LIBPATH=/opt/rocm-3.9.0/lib

case "$1" in
  abort-deconfigure|abort-remove|abort-upgrade)
    echo "$1"
  ;;
  configure)
    mkdir -p ${ROCM_LIBPATH}
    ln -s -f -r ${INSTALL_PATH}/lib/libOpenCL.so ${ROCM_LIBPATH}/libOpenCL.so
    ln -s -f -r ${INSTALL_PATH}/lib/libOpenCL.so.1 ${ROCM_LIBPATH}/libOpenCL.so.1
    ln -s -f -r ${INSTALL_PATH}/lib/libOpenCL.so.1.2 ${ROCM_LIBPATH}/libOpenCL.so.1.2
    do_ldconfig
  ;;

As you can see, the do_ldconfig function uses the wrong paths: it checks and registers the unversioned /opt/rocm/opencl rather than /opt/rocm-3.9.0/opencl, and it writes only the bare library name into the ICD file. The file in ld.so.conf.d is also not versioned properly.

The issue is that it is not really possible to have two different versions installed with both present in ld.so.conf.d. There must be only one.

This should probably be managed using the Debian alternatives mechanism.

I am only asking for this so that ROCm is easier to install and use. Having to modify LD_LIBRARY_PATH and PATH for every user, or even system-wide, is a pain.

With all paths encoded properly, nobody would need to mess with this, and it would work out of the box after installing the deb packages.
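
For illustration only, here is a sketch of how the ICD part of the script could write the versioned absolute path instead of the bare name. This is not AMD's actual fix. The function name write_icd and the ROOT prefix are hypothetical; ROOT exists only so the function can be tried without touching the real /etc.

```shell
# Hypothetical replacement for the ICD line in do_ldconfig (a sketch, not AMD's code).
# INSTALL_PATH matches the value set by the postinst script quoted above.
INSTALL_PATH="${INSTALL_PATH:-/opt/rocm-3.9.0/opencl}"
# ROOT is an assumed staging prefix; leave it empty to write to the real /etc.
ROOT="${ROOT:-}"

write_icd() {
  # Writing the absolute path means the ICD loader does not depend on
  # ld.so.conf.d or LD_LIBRARY_PATH to locate libamdocl64.so.
  mkdir -p "$ROOT/etc/OpenCL/vendors" &&
    echo "$INSTALL_PATH/lib/libamdocl64.so" > "$ROOT/etc/OpenCL/vendors/amdocl64_30900.icd"
}
```

With ROOT set to a scratch directory, write_icd produces an amdocl64_30900.icd containing /opt/rocm-3.9.0/opencl/lib/libamdocl64.so, which is the same form as the manual fix reported in the original post.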

Multiple versions of ROCm installed at the same time can be handled in Debian and Ubuntu using the update-alternatives(1) mechanism, which would still make everything work out of the box.

In my installation there are hsa files in /etc/ld.so.conf.d/, but none of them contains the /opt/rocm-3.5.0/opencl/lib/ path, so the dynamic loader can’t actually find the library:
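
A rough sketch of what that could look like follows. It only prints the update-alternatives command as a dry run rather than executing it. The per-version ICD file location /opt/rocm-VERSION/opencl/amdocl64.icd, the alternative name, and both function names are assumptions made up for this example, not something the ROCm packages are known to ship. The priority is derived from the version in the same style as the 30900 suffix seen in amdocl64_30900.icd.

```shell
# Turn a version such as 3.9.0 into a numeric priority such as 30900.
version_priority() {
  old_ifs=$IFS
  IFS=.
  set -- $1            # split MAJOR.MINOR.PATCH on dots
  IFS=$old_ifs
  echo $(( $1 * 10000 + $2 * 100 + $3 ))
}

# Print (do not run) the command a postinst could use to register one version.
# The per-version ICD path below is hypothetical.
alternatives_cmd() {
  ver="$1"
  echo "update-alternatives --install /etc/OpenCL/vendors/amdocl64.icd amdocl64.icd /opt/rocm-$ver/opencl/amdocl64.icd $(version_priority "$ver")"
}
```

For example, alternatives_cmd 3.9.0 prints an --install line with priority 30900, so a newer version would win automatically while an administrator could still pick another one with update-alternatives --config.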

$ ldconfig -p | grep -i OpenCL
	libOpenCL.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libOpenCL.so.1
	libOpenCL.so.1 (libc6) => /lib/i386-linux-gnu/libOpenCL.so.1
	libOpenCL.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libOpenCL.so
	libMesaOpenCL.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libMesaOpenCL.so.1
	libMesaOpenCL.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libMesaOpenCL.so
	libBullet3OpenCL_clew.so.2.88 (libc6,x86-64) => /lib/x86_64-linux-gnu/libBullet3OpenCL_clew.so.2.88
	libBullet3OpenCL_clew.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libBullet3OpenCL_clew.so
	libBullet3OpenCL_clew-float64.so.2.88 (libc6,x86-64) => /lib/x86_64-linux-gnu/libBullet3OpenCL_clew-float64.so.2.88
	libBullet3OpenCL_clew-float64.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libBullet3OpenCL_clew-float64.so
$

Also there are wrong paths there:

root$ ldconfig -v 2>&1 | grep 'rocm'  | grep 'No such file or directory'
ldconfig: Can't stat /opt/rocm/hsa/lib: No such file or directory
root$

Here are the files:

root$ grep . /etc/ld.so.conf.d/hsa-*
/etc/ld.so.conf.d/hsa-ext-rocr-dev.conf:/opt/rocm/hsa/lib
/etc/ld.so.conf.d/hsa-rocr-dev.conf:/opt/rocm-3.5.0/hsa/lib
root$
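
The same check can be done without parsing ldconfig's error output. The sketch below lists every ld.so.conf.d entry whose directory does not exist, in the same file:path form as the grep output above. The function name stale_ld_paths is made up for this example, and lines that are not absolute paths (comments, include directives) are simply skipped.

```shell
# List ld.so.conf.d entries that point at directories which do not exist.
# The directory argument is optional and defaults to /etc/ld.so.conf.d.
stale_ld_paths() {
  conf_dir="${1:-/etc/ld.so.conf.d}"
  for conf in "$conf_dir"/*.conf; do
    [ -e "$conf" ] || continue             # glob matched nothing
    while IFS= read -r dir || [ -n "$dir" ]; do
      case "$dir" in
        /*) ;;                             # absolute directory entry: check it
        *) continue ;;                     # blank, comment, or other directive
      esac
      [ -d "$dir" ] || echo "$conf:$dir"
    done < "$conf"
  done
}
```

On the installation described here, this would be expected to report the hsa-ext-rocr-dev.conf entry for /opt/rocm/hsa/lib.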

After adding one that points to the opencl directory and rerunning ldconfig, it does start to work:

root$ echo "/opt/rocm-3.5.0/opencl/lib" > /etc/ld.so.conf.d/rocm-opencl.conf
root$ ldconfig
root$
$ ldconfig -p | grep OpenCL
	libamdocl64.so (libc6,x86-64) => /opt/rocm-3.5.0/opencl/lib/libamdocl64.so
	libOpenCL.so.1 (libc6,x86-64) => /opt/rocm-3.5.0/opencl/lib/libOpenCL.so.1
	libOpenCL.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libOpenCL.so.1
	libOpenCL.so.1 (libc6) => /lib/i386-linux-gnu/libOpenCL.so.1
	libOpenCL.so (libc6,x86-64) => /opt/rocm-3.5.0/opencl/lib/libOpenCL.so
	libOpenCL.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libOpenCL.so
	libMesaOpenCL.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libMesaOpenCL.so.1
	libMesaOpenCL.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libMesaOpenCL.so
	libBullet3OpenCL_clew.so.2.88 (libc6,x86-64) => /lib/x86_64-linux-gnu/libBullet3OpenCL_clew.so.2.88
	libBullet3OpenCL_clew.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libBullet3OpenCL_clew.so
	libBullet3OpenCL_clew-float64.so.2.88 (libc6,x86-64) => /lib/x86_64-linux-gnu/libBullet3OpenCL_clew-float64.so.2.88
	libBullet3OpenCL_clew-float64.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libBullet3OpenCL_clew-float64.so

So my bug is still valid: the icd file should either use an absolute path (it works with dlopen, and I tested it), or the library should be in the default LD_LIBRARY_PATH (or in a directory listed in ld.so.conf.d). In my installation neither is the case.

$ dpkg -l | egrep 'rocm|hsa' | grep '^ii' | awk '{print $2, $3; }'
comgr 1.6.0.143-rocm-rel-3.5-30-e24e8c1
hsa-ext-rocr-dev 1.1.30500.0-rocm-rel-3.5-30-def83d8
hsa-rocr-dev 1.1.30500.0-rocm-rel-3.5-30-def83d8
hsakmt-roct 1.0.9-347-gd4b224f
rocm-clang-ocl 0.5.0.51-rocm-rel-3.5-30-74b3b81
rocm-opencl 2.0.20191
rocm-opencl-dev 2.0.20191
rocm-utils 3.5.0-30
rocminfo 1.30500.0

PS. The clinfo shipped with ROCm is different from the one I use. I use Oblomov’s clinfo because it is in my PATH by default and I know it works fine with the Nvidia, Intel, Mesa Clover and pocl OpenCL implementations, and I don’t want to lose that compatibility. I didn’t want to use the clinfo bundled with ROCm because I suspect it has its own hardcoded path and/or its own way of locating the ROCm OpenCL loader, which I shouldn’t rely on either.

It seems that the right way to fix this is to add a file to /etc/ld.so.conf.d/ containing the path /opt/rocm-VERSION/opencl/lib/.

And this file should be in the rocm-opencl package.