cuml: [BUG] libcudart.so: cannot open shared object file: No such file or directory

Describe the bug

I installed cuml, and it throws an error when I run the script below.

Steps/Code to reproduce bug

import os
import pandas as pd

path_dataset = 'Dataset'
df_all = pd.read_json(os.path.join(path_dataset, 'filtered.json'))

from sentence_transformers import SentenceTransformer
embedding_model = SentenceTransformer('all-MiniLM-L6-v2')

docs = df_all['Challenge_summary'].tolist()
embeddings = embedding_model.encode(docs)

import matplotlib.pyplot as plt
import cuml
model = cuml.TSNE(n_neighbors=32)
embed2D = model.fit_transform(embeddings)
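# Store the 2D coordinates on the dataframe the embeddings were computed from
df_all['x'] = embed2D[:, 0]
df_all['y'] = embed2D[:, 1]
# figsize is in inches; (1000, 1000) exceeds what matplotlib can render
fig = plt.figure(figsize=(10, 10))
plt.scatter(df_all.x, df_all.y, color='blue', s=10, label='Clusters')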
fig.savefig('test.png')

Expected behavior

The script runs successfully.

Environment details (please complete the following information):

Environment location: Bare-metal
Linux Distro/Architecture: Linux docjk-gpu-01 5.15.0-67-generic #74-Ubuntu SMP Wed Feb 22 14:14:39 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
GPU Model/Driver: A100 and 525.85.12
CUDA: 12.0
Method of cuDF & cuML install:

pip install cudf-cu11 dask-cudf-cu11 --extra-index-url=https://pypi.nvidia.com
pip install cuml-cu11 --extra-index-url=https://pypi.nvidia.com
pip install cugraph-cu11 --extra-index-url=https://pypi.nvidia.com

following the instructions at https://docs.rapids.ai/install#pip

Additional context

Error trace:

(.venv) 21zz42@docjk-gpu-01:~/Asset-Management-Topic-Modeling$ python "Code/best_challenge copy.py"
Traceback (most recent call last):
  File "/home/21zz42/Asset-Management-Topic-Modeling/Code/best_challenge copy.py", line 52, in <module>
    import cuml
  File "/home/21zz42/Asset-Management-Topic-Modeling/.venv/lib/python3.10/site-packages/cuml/__init__.py", line 17, in <module>
    from cuml.internals.base import Base, UniversalBase
  File "/home/21zz42/Asset-Management-Topic-Modeling/.venv/lib/python3.10/site-packages/cuml/internals/__init__.py", line 17, in <module>
    from cuml.internals.base_helpers import (
  File "/home/21zz42/Asset-Management-Topic-Modeling/.venv/lib/python3.10/site-packages/cuml/internals/base_helpers.py", line 20, in <module>
    from cuml.internals.api_decorators import (
  File "/home/21zz42/Asset-Management-Topic-Modeling/.venv/lib/python3.10/site-packages/cuml/internals/api_decorators.py", line 24, in <module>
    from cuml.internals import input_utils as iu
  File "/home/21zz42/Asset-Management-Topic-Modeling/.venv/lib/python3.10/site-packages/cuml/internals/input_utils.py", line 19, in <module>
    from cuml.internals.array import CumlArray
  File "/home/21zz42/Asset-Management-Topic-Modeling/.venv/lib/python3.10/site-packages/cuml/internals/array.py", line 22, in <module>
    from cuml.internals.global_settings import GlobalSettings
  File "/home/21zz42/Asset-Management-Topic-Modeling/.venv/lib/python3.10/site-packages/cuml/internals/global_settings.py", line 20, in <module>
    from cuml.internals.device_type import DeviceType
  File "/home/21zz42/Asset-Management-Topic-Modeling/.venv/lib/python3.10/site-packages/cuml/internals/device_type.py", line 19, in <module>
    from cuml.internals.mem_type import MemoryType
  File "/home/21zz42/Asset-Management-Topic-Modeling/.venv/lib/python3.10/site-packages/cuml/internals/mem_type.py", line 25, in <module>
    cudf = gpu_only_import('cudf')
  File "/home/21zz42/Asset-Management-Topic-Modeling/.venv/lib/python3.10/site-packages/cuml/internals/safe_imports.py", line 366, in gpu_only_import
    return importlib.import_module(module)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/home/21zz42/Asset-Management-Topic-Modeling/.venv/lib/python3.10/site-packages/cudf/__init__.py", line 5, in <module>
    validate_setup()
  File "/home/21zz42/Asset-Management-Topic-Modeling/.venv/lib/python3.10/site-packages/cudf/utils/gpu_utils.py", line 95, in validate_setup
    cuda_runtime_version = runtimeGetVersion()
  File "/home/21zz42/Asset-Management-Topic-Modeling/.venv/lib/python3.10/site-packages/rmm/_cuda/gpu.py", line 87, in runtimeGetVersion
    major, minor = numba.cuda.runtime.get_version()
  File "/home/21zz42/Asset-Management-Topic-Modeling/.venv/lib/python3.10/site-packages/numba/cuda/cudadrv/runtime.py", line 111, in get_version
    self.cudaRuntimeGetVersion(ctypes.byref(rtver))
  File "/home/21zz42/Asset-Management-Topic-Modeling/.venv/lib/python3.10/site-packages/numba/cuda/cudadrv/runtime.py", line 65, in __getattr__
    self._initialize()
  File "/home/21zz42/Asset-Management-Topic-Modeling/.venv/lib/python3.10/site-packages/numba/cuda/cudadrv/runtime.py", line 51, in _initialize
    self.lib = open_cudalib('cudart')
  File "/home/21zz42/Asset-Management-Topic-Modeling/.venv/lib/python3.10/site-packages/numba/cuda/cudadrv/libs.py", line 60, in open_cudalib
    return ctypes.CDLL(path)
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcudart.so: cannot open shared object file: No such file or directory

The answer at https://stackoverflow.com/questions/69934320/oserror-libcudart-so-10-2-cannot-open-shared-object-file-no-such-file-or-dire does not resolve this for me: PyTorch runs successfully on the same machine, yet cuml still fails to import.
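
The final frames of the trace show numba calling ctypes.CDLL, and the error message names the bare file libcudart.so, which suggests the dynamic loader could not find the library on its search path. PyTorch pip wheels typically ship their own CUDA runtime libraries, so a working PyTorch does not by itself show that libcudart.so is visible to other packages. A minimal diagnostic sketch follows. The function name probe_cudart and the candidate file names are assumptions chosen for illustration and are not part of cuml, cudf, or numba; adjust the candidates to the CUDA version you expect.

```python
import ctypes
import ctypes.util
import os


def probe_cudart(candidates=("libcudart.so", "libcudart.so.12", "libcudart.so.11.0")):
    """Try to load each candidate library name and record the outcome.

    The candidate names are assumptions; they are not read from cuml or numba.
    """
    results = {}
    for name in candidates:
        try:
            ctypes.CDLL(name)
            results[name] = "loaded"
        except OSError as exc:
            results[name] = f"failed: {exc}"
    return results


if __name__ == "__main__":
    # Environment variables that commonly influence where CUDA libraries are found
    print("CUDA_HOME =", os.environ.get("CUDA_HOME"))
    print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH"))
    # Ask the platform's library search for a CUDA runtime
    print("find_library('cudart') =", ctypes.util.find_library("cudart"))
    for name, status in probe_cudart().items():
        print(f"{name}: {status}")
```

If every candidate fails, the CUDA runtime is probably not installed system-wide or not on the loader path, which would be consistent with the OSError above.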

About this issue

State: open
Created a year ago
Reactions: 3
Comments: 20 (7 by maintainers)

Most upvoted comments

Just adding another data point, and posting a thanks to the developers for their work on this. The installation guide (https://docs.rapids.ai/install#pip) currently claims support for CUDA 12 with pip. I am running CUDA 12.0, and installing cuml with pip succeeded (pip install cudf-cu12 cuml-cu12 --extra-index-url=https://pypi.nvidia.com). But I get the same libcudart.so error when I try to train a model.

mike@henry:~$ nvidia-smi
Thu Jul 27 17:42:47 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.125.06   Driver Version: 525.125.06   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A300...  Off  | 00000000:01:00.0  On |                  N/A |
| N/A   56C    P8    17W / 115W |    865MiB /  6144MiB |     25%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2044      G   /usr/lib/xorg/Xorg                362MiB |
|    0   N/A  N/A      2527      G   /usr/bin/gnome-shell              142MiB |
|    0   N/A  N/A      3481      G   ...veSuggestionsOnlyOnDemand       82MiB |
|    0   N/A  N/A      8067      G   ...8/usr/lib/firefox/firefox      183MiB |
|    0   N/A  N/A     37940      G   ...RendererForSitePerProcess       35MiB |
+-----------------------------------------------------------------------------+

mfschmidt on Jul 27, 2023
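
A note on reading the nvidia-smi output above: the "CUDA Version" field in the header reports the highest CUDA version the installed driver supports. It does not show that a CUDA toolkit, or libcudart.so itself, is present on the machine. The sketch below extracts the two version fields from the header line so they can be compared against the pip package suffix (cu11 or cu12). The helper name parse_smi_header is hypothetical, and the regular expression assumes the header layout shown in this thread.

```python
import re

# Header line copied from the nvidia-smi output in the comment above
HEADER = "| NVIDIA-SMI 525.125.06   Driver Version: 525.125.06   CUDA Version: 12.0     |"


def parse_smi_header(line):
    """Return the driver and CUDA version strings from an nvidia-smi header line.

    Returns None if the line does not match the assumed layout.
    """
    match = re.search(r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", line)
    if match is None:
        return None
    return {"driver": match.group(1), "cuda": match.group(2)}


if __name__ == "__main__":
    print(parse_smi_header(HEADER))
```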