tensorflow: cuDNN, cuFFT, and cuBLAS Errors

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

Yes

Source

source

TensorFlow version

GIT_VERSION:v2.14.0-rc1-21-g4dacf3f368e VERSION:2.14.0

Custom code

No

OS platform and distribution

WSL2 Linux Ubuntu 22

Mobile device

No response

Python version

3.10, but I can try different versions

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

CUDA version: 11.8, cuDNN version: 8.7

GPU model and memory

NVIDIA Geforce GTX 1660 Ti, 8GB Memory

Current behavior?

When I run the GPU test from the TensorFlow install instructions, I get several errors and warnings. I don’t care about the NUMA stuff, but the first 3 errors are that TensorFlow was not able to load cuDNN. I would really like to be able to use it to speed up training some RNNs and FFNNs. I do get my GPU in the list of physical devices, so I can still train, but not as fast as with cuDNN.

Standalone code to reproduce the issue

python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

Relevant log output

2023-10-09 13:36:23.355516: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-10-09 13:36:23.355674: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-10-09 13:36:23.355933: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-10-09 13:36:23.413225: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-10-09 13:36:25.872586: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-09 13:36:25.916952: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-09 13:36:25.917025: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
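
To separate the three factory-registration errors from the informational NUMA and CPU-feature lines, the log can be parsed by severity. The following is a minimal sketch that assumes the log prefix format seen above (timestamp, one-letter level, file:line, message); the function names `parse_log` and `duplicate_registrations` are made up for illustration and are not part of TensorFlow.

```python
import re

# Matches the log prefix seen in this report, e.g.
# "2023-10-09 13:36:23.355516: E tensorflow/.../cuda_dnn.cc:9342] message"
LOG_LINE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<level>[IWEF]) "
    r"(?P<file>\S+):(?P<line>\d+)\] "
    r"(?P<message>.*)$"
)


def parse_log(text):
    """Return one dict per line that carries a log prefix; skip continuation lines."""
    records = []
    for raw in text.splitlines():
        match = LOG_LINE.match(raw.strip())
        if match:
            records.append(match.groupdict())
    return records


def duplicate_registrations(records):
    """Return the plugin names mentioned in error-level 'Unable to register' lines."""
    pattern = re.compile(r"Unable to register (\w+) factory")
    names = []
    for record in records:
        found = pattern.search(record["message"])
        if record["level"] == "E" and found:
            names.append(found.group(1))
    return names


# Two log entries copied from the report above, plus one continuation line.
SAMPLE = """\
2023-10-09 13:36:23.355516: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-10-09 13:36:23.413225: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
"""

if __name__ == "__main__":
    print(duplicate_registrations(parse_log(SAMPLE)))
```

Piping the real output of the reproduction command into `parse_log` should show whether anything other than the three registration messages is logged at level E.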

About this issue

  • State: open
  • Created 9 months ago
  • Reactions: 59
  • Comments: 124 (4 by maintainers)

Most upvoted comments

Hello,

I’m experiencing the same issue, even though I meticulously followed all the instructions for setting up CUDA 11.8 and CuDNN 8.7. The error messages I’m encountering are as follows:

Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered.
Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered.
Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered.

I’ve tried this with different versions of Python. Surprisingly, when I used Python 3.11, TensorFlow 2.13 was installed without these errors. However, when I used Python 3.10 or 3.9, I ended up with TensorFlow 2.14 and the aforementioned errors.

I’ve come across information suggesting that I may not need to manually install CUDA and CuDNN, as [and-cuda] should handle the installation of these components automatically.

Could someone please guide me on the correct approach to resolve this issue? I’ve tried various methods, but unfortunately, none of them have yielded a working solution.

P.S. I’m using conda in WSL 2 on Windows 11.
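
Since pip resolved to a different TensorFlow release depending on the Python version, it can help to confirm which release actually ended up in the active environment before comparing results. This is a small sketch using only the standard library; the helper names `installed_version` and `minor_release` are illustrative.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def minor_release(version_string):
    """Reduce a version such as '2.14.0' to the tuple (2, 14)."""
    major, minor = version_string.split(".")[:2]
    return int(major), int(minor)


if __name__ == "__main__":
    # Both distribution names appear in this thread.
    for name in ("tensorflow", "tf-nightly"):
        print(name, installed_version(name))
```

Run it with the same interpreter that shows the errors (for example `python3.10` versus `python3.11`), because each interpreter has its own site-packages.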

i’m dying, this issue kills my career

I also have the same issue, and it does not seem to be caused by the CUDA environment, as I rebuilt CUDA and cuDNN to match tf-2.14.0.

This is the log output I get:

python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

2023-10-11 18:21:57.387396: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2023-10-11 18:21:57.415774: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-10-11 18:21:57.415847: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-10-11 18:21:57.415877: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-10-11 18:21:57.421400: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-10-11 18:21:58.155058: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2023-10-11 18:21:59.113217: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:65:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-11 18:21:59.152044: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:65:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-11 18:21:59.152153: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:880] could not open file to read NUMA node: /sys/bus/pci/devices/0000:65:00.0/numa_node
Your kernel may have been built without NUMA support.
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

@picobyte I am already using a venv. And system packages are not related to this bug 😉. Tensorflow is linking twice with the same object; not my system packages.

It turns out that bazel-out/k8-opt-exec-50AE0418/bin/tensorflow/compiler/xla/stream_executor/cuda/libcudnn_plugin.pic.lo contains the duplicated symbol(s):

(venv) daniel:/usr/src/Arch/tensorflow/tensorflow/src/tensorflow-2.14.0-opt-cuda-dbg>readelf --symbols --wide ./bazel-out/k8-opt-exec-50AE0418/bin/tensorflow/compiler/xla/stream_executor/cuda/libcudnn_plugin.pic.lo | grep google_initializer_module_register_cudnn
  4409: 0000000000000000     1 OBJECT  GLOBAL DEFAULT 3281 google_initializer_module_register_cudnn

and bazel aquery 'mnemonic("CppLink", //tensorflow:libtensorflow_cc.so.2.14.0)' --output=text shows that it linked with libtensorflow_cc.
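
The same readelf check can be repeated across several object files or shared libraries to see which ones define the initializer symbol. Below is a rough sketch that parses `readelf --symbols --wide` output; it assumes the eight-column row layout shown above, and the file names passed in are placeholders, not real build outputs.

```python
import re

# One row of `readelf --symbols --wide`:
#   Num: Value Size Type Bind Vis Ndx Name
SYMBOL_ROW = re.compile(
    r"^\s*(?P<num>\d+):\s+(?P<value>[0-9a-fA-F]+)\s+(?P<size>\S+)\s+"
    r"(?P<type>\S+)\s+(?P<bind>\S+)\s+(?P<vis>\S+)\s+(?P<ndx>\S+)\s+(?P<name>\S+)"
)


def defined_globals(readelf_output, name_pattern):
    """Return names of GLOBAL symbols that are defined (not UND) and match the pattern."""
    names = set()
    for row in readelf_output.splitlines():
        match = SYMBOL_ROW.match(row)
        if not match:
            continue
        if (match["bind"] == "GLOBAL" and match["ndx"] != "UND"
                and re.search(name_pattern, match["name"])):
            names.add(match["name"])
    return names


def duplicated(outputs_by_file, name_pattern):
    """Map each matching symbol defined in more than one file to those files."""
    owners = {}
    for path, output in outputs_by_file.items():
        for name in defined_globals(output, name_pattern):
            owners.setdefault(name, []).append(path)
    return {name: sorted(paths) for name, paths in owners.items() if len(paths) > 1}


if __name__ == "__main__":
    # Row copied from the readelf output above; the two file names are hypothetical.
    row = ("  4409: 0000000000000000     1 OBJECT  GLOBAL DEFAULT 3281 "
           "google_initializer_module_register_cudnn")
    print(duplicated({"first.so": row, "second.so": row},
                     r"google_initializer_module_register_"))
```

For stripped shared libraries the dynamic symbol table may be the only one available, so the readelf invocation itself may need adjusting.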

After a whole-day-long investigation, I found the following comment in tensorflow/BUILD:

# To avoid duplication, check that the C++ or python library does not depend on
# the stream executor cuda plugins. Targets that want to use cuda APIs should
# instead depend on the dummy plugins in //tensorflow/tsl/platform/default/build_config
# and use header only targets.
# TODO(ddunleavy): This seems completely broken. :tensorflow_cc depends on
# cuda_platform from tf_additional_binary_deps and this doesn't break.
check_deps(
    name = "cuda_plugins_check_deps",
    disallowed_deps = if_static(
        [],
        otherwise = [
            "//tensorflow/compiler/xla/stream_executor/cuda:all_runtime",
            "//tensorflow/compiler/xla/stream_executor/cuda:cuda_driver",
            "//tensorflow/compiler/xla/stream_executor/cuda:cuda_platform",
            "//tensorflow/compiler/xla/stream_executor/cuda:cudnn_plugin",
            "//tensorflow/compiler/xla/stream_executor/cuda:cufft_plugin",
            "//tensorflow/compiler/xla/stream_executor:cuda_platform",
        ],
    ),
    deps = if_cuda([
        "//tensorflow:tensorflow_cc",
        "//tensorflow/python:pywrap_tensorflow_internal",
    ]),
)

So, apparently ddunleavy already knew about this bug.
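
As I read the BUILD comment, the intent of the rule is that neither of the two targets in `deps` may reach any of the `disallowed_deps` through its dependencies. The sketch below shows that idea on a toy graph; it is not how Bazel or `check_deps` is implemented, and the target names in the example are invented.

```python
from collections import deque


def reachable(graph, start):
    """Return every target reachable from `start` by following dependency edges."""
    seen = set()
    queue = deque(graph.get(start, ()))
    while queue:
        target = queue.popleft()
        if target in seen:
            continue
        seen.add(target)
        queue.extend(graph.get(target, ()))
    return seen


def violations(graph, roots, disallowed):
    """Map each root to the disallowed targets it reaches; omit roots with none."""
    banned = set(disallowed)
    result = {}
    for root in roots:
        hits = sorted(reachable(graph, root) & banned)
        if hits:
            result[root] = hits
    return result


if __name__ == "__main__":
    # Hypothetical graph: "app" reaches the plugin only through "helpers".
    toy_graph = {
        "app": ["helpers"],
        "helpers": ["cudnn_plugin"],
        "tool": ["util"],
    }
    print(violations(toy_graph, ["app", "tool"], ["cudnn_plugin"]))
```

A check of this kind only helps if it covers the targets that are actually linked into the shipped libraries, which is what the TODO in the BUILD file appears to question.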

@SuryanarayanaY I tried several times, reinstalling Ubuntu, but it still doesn’t work.

Spent 5 hours on this, the register factory issue was resolved with:

$ pip uninstall tensorflow
$ python3 -m pip install tf-nightly[and-cuda]

But first verify that you’ve set up cuDNN and nvcc correctly; otherwise it’s not TF’s problem.

Now I have the NUMA issue remaining, gonna go figure that out next.

@ymodak This piece of shit is still unresolved in the latest version. WHY and HOW could you remove the bug label !? That DOES NOT MAKE SENSE! HOW INSANE! I am confused if your test cases have covered this issue or not?

I have found a workaround that seems to be effective. By installing the following specific versions of TensorFlow and its related packages, the issue is resolved:

tensorflow==2.9.0
tensorflow-datasets==4.9.3
tensorflow-estimator==2.9.0
tensorflow-io-gcs-filesystem==0.34.0
tensorflow-metadata==1.14.0

These versions work well together and avoid the problem encountered with TensorFlow 2.14.0.

According to @Romeo-CC in https://github.com/tensorflow/tensorflow/issues/62002#issuecomment-1800718221: The issue was present in versions 2.10 and 2.11, resolved in 2.12, but reemerged in 2.14.

Hi TensorFlow maintainers, can this please be fixed? It works with GPU for 2.13 but no longer for 2.15 (I haven’t tried 2.14, but others above seem to have):

  • Ubuntu 22.04
  • Python 3.11.7

TensorFlow==2.13.1 works without any issue. (Environment: Ubuntu 22.04.2 LTS WSL2, cuda=11.8, cudnn=8.9.6)

python3.11 -m pip install tensorflow==2.13.1

I installed 2.15 with no luck, then tried 2.14.1 with errors still there.


CUDA

https://developer.nvidia.com/cuda-11-8-0-download-archive?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=deb_network

Installation Instructions:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-11-8

Set up your paths:

echo 'export PATH=/usr/local/cuda-11.8/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
sudo ldconfig
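
After editing ~/.bashrc it is easy to end up in a shell that has not picked up the change. A quick way to check is to look for the CUDA directories in the two variables. This is a small sketch; the helper name `on_search_path` is made up, and the directories are the ones used in the commands above.

```python
import os


def on_search_path(directory, variable, environ=os.environ):
    """Return True if `directory` is an entry of the colon-separated `variable`."""
    entries = environ.get(variable, "").split(":")
    wanted = os.path.normpath(directory)
    # Empty entries (from a leading, trailing, or doubled colon) are skipped.
    return any(entry and os.path.normpath(entry) == wanted for entry in entries)


if __name__ == "__main__":
    print("PATH:", on_search_path("/usr/local/cuda-11.8/bin", "PATH"))
    print("LD_LIBRARY_PATH:",
          on_search_path("/usr/local/cuda-11.8/lib64", "LD_LIBRARY_PATH"))
```

If either line prints False in the shell used to start Python, open a new terminal or run `source ~/.bashrc` again.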


cuDNN

https://developer.nvidia.com/rdp/cudnn-download

Download cuDNN v8.9.6 (November 1st, 2023), for CUDA 11.x

wget https://developer.nvidia.com/downloads/compute/cudnn/secure/8.9.6/local_installers/11.x/cudnn-linux-x86_64-8.9.6.50_cuda11-archive.tar.xz

sudo tar -xvf cudnn-linux-x86_64-8.9.6.50_cuda11-archive.tar.xz
sudo mv cudnn-linux-x86_64-8.9.6.50_cuda11-archive cuda

Copy the following files into the CUDA toolkit directory:

sudo cp -P cuda/include/cudnn*.h /usr/local/cuda-11.8/include
sudo cp -P cuda/lib/libcudnn* /usr/local/cuda-11.8/lib64/
sudo chmod a+r /usr/local/cuda-11.8/lib64/libcudnn*

Verify Installation:

nvidia-smi
nvcc -V


For NUMBA

Add these lines at the end of your .bashrc:

export LD_LIBRARY_PATH="/usr/lib/wsl/lib/"
export NUMBA_CUDA_DRIVER="/usr/lib/wsl/lib/libcuda.so.1"

Hi @Ke293-x2Ek-Qe-7-aE-B ,

Starting from TF 2.14, TensorFlow provides a CUDA package that can install all the cuDNN, cuFFT, and cuBLAS libraries.

You can use the pip install tensorflow[and-cuda] command for that.

Please try this command and let us know if it helps. Thank you!
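
To see whether the [and-cuda] extra actually pulled CUDA libraries into the environment, the installed distributions can be listed. The sketch below assumes those libraries ship as pip packages whose names start with `nvidia-`; that naming is my assumption and not something stated in this thread, and the helper name `nvidia_wheels` is illustrative.

```python
from importlib.metadata import distributions


def nvidia_wheels():
    """Return sorted (name, version) pairs for installed distributions named 'nvidia-*'."""
    found = []
    for dist in distributions():
        # Assumption: the CUDA libraries are packaged under names starting with "nvidia-".
        name = dist.metadata.get("Name", "")
        if name and name.lower().startswith("nvidia-"):
            found.append((name, dist.version))
    return sorted(found)


if __name__ == "__main__":
    wheels = nvidia_wheels()
    if not wheels:
        print("No nvidia-* distributions found in this environment.")
    for name, version in wheels:
        print(f"{name}=={version}")
```

An empty list in a fresh virtual environment would suggest that TensorFlow is falling back to a system-wide CUDA installation, if one exists.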

It’s been five months, yet the problem remains.

Installing 2.16.0-dev20231212 with tf-nightly[and-cuda] resolved the issue for me.

Python 3.10, Ubuntu 22.04.3, WSL2, Windows 10

PS: The error is also still present in 2.15.

@ymodak How is it not a bug when you add the same global variable to two different shared libraries that have to be used together in most of the cases, causing the same initialization code to be run twice? This was never the intention, and is inherently wrong. The fact that it seems like there is no ill effect coming from it, because the second call will be ignored, does not convince me it should be left like this: there might be other duplicates. We’re talking clearly about the “Initialization order fiasco” here or something of the same level of badness.
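
To make the mechanism concrete: if two loaded libraries each carry the same static initializer, the registration code runs once per library, and the second call hits an "already registered" check like the `factories->find(plugin_id) != factories->end()` line visible in the gdb output later in this thread. The following is a toy Python model of that behaviour, not the actual stream_executor code; it keys the registry by plugin name for simplicity, and all class and function names are illustrative.

```python
class PluginRegistry:
    """Toy stand-in for a registry that accepts one factory per plugin."""

    def __init__(self):
        self._factories = {}

    def register_factory(self, plugin_name, factory):
        """Store `factory`; return None on success or an error string if taken."""
        if plugin_name in self._factories:
            return (f"Attempting to register factory for plugin {plugin_name} "
                    "when one has already been registered")
        self._factories[plugin_name] = factory
        return None

    def factory_for(self, plugin_name):
        """Return the factory that won the registration."""
        return self._factories[plugin_name]


def run_static_initializers(registry, libraries):
    """Mimic a loader running every initializer of every loaded library once."""
    errors = []
    for library in libraries:
        for plugin_name, factory in library:
            error = registry.register_factory(plugin_name, factory)
            if error:
                errors.append(f"Unable to register {plugin_name} factory: {error}")
    return errors


if __name__ == "__main__":
    # Two hypothetical libraries that both contain the cuDNN initializer.
    first_library = [("cuDNN", lambda: "factory from first library")]
    second_library = [("cuDNN", lambda: "factory from second library")]
    registry = PluginRegistry()
    for message in run_static_initializers(registry, [first_library, second_library]):
        print(message)
    print(registry.factory_for("cuDNN")())
```

In this model the first registration wins and the second is only reported, which matches the observation that the GPU still works; it says nothing about whether other duplicated globals are equally harmless.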

I have a first try at solving this that should hopefully land soon, waiting on review internally. Sorry for the delay here

tf 2.15, followed install guide on wsl2, same situation

Can confirm, with tf 2.15, and a complete system update, these errors persist, but do not appear to limit any functionality or performance.

The message suggests the factories in question are already present: Attempting to register factory for plugin cuDNN when one has already been registered

I don’t understand why the bug label was removed, even though it’s still a bug and an error. Not to mention, there’s a new version, TensorFlow 2.15, where it should have been fixed, but it remains unresolved.

When 2.15.X didn’t work, tensorflow 2.16.1 (without CUDA) solved this issue for me. Python3.10, CUDA driver 12.2, Cuda Toolkit 12.1, cuDNN 8.9.5.

pip uninstall tensorflow && pip install tf-nightly[and-cuda]

@Romeo-CC @CarloWood Sorry for the misunderstanding. I see that this issue is related to type:build/install and as per issue triage workflow we limit the type label usage to one label per issue for internal tracking purposes hence I removed type:bug label . Having said that I acknowledge that this is an active problem and have notified the relevant team.

Same issue

(.venv) nikhil@nikhil:/Documents/Neko-Nik/Playground/Ai_LLM_ML$ nvidia-smi
Sun Dec 10 12:11:14 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1050 Ti     Off | 00000000:2B:00.0  On |                  N/A |
|  0%   37C    P8              N/A /  72W |    184MiB /  4096MiB |     17%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A     54353      G   /usr/lib/xorg/Xorg                           82MiB |
|    0   N/A  N/A     54480      G   /usr/bin/gnome-shell                         67MiB |
|    0   N/A  N/A     55973      G   /usr/bin/nextcloud                           11MiB |
+---------------------------------------------------------------------------------------+
(.venv) nikhil@nikhil:/Documents/Neko-Nik/Playground/Ai_LLM_ML$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0
(.venv) nikhil@nikhil:/Documents/Neko-Nik/Playground/Ai_LLM_ML$ python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
2023-12-10 12:11:45.917541: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-12-10 12:11:45.917573: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-12-10 12:11:45.918814: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-12-10 12:11:45.924665: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-12-10 12:11:46.405375: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2023-12-10 12:11:46.751946: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-12-10 12:11:46.793957: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2023-12-10 12:11:46.794148: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

Just a quick heads up @maulberto3, when you use nvidia-smi, the CUDA version you see in the top right corner is the highest version of CUDA compatible with your Nvidia driver, and not actually the currently installed CUDA version. Good to know since that’s confused me in the past.
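
The difference is visible in the output pasted above: nvidia-smi reports CUDA 12.2 while nvcc reports release 11.5. Here is a small sketch that extracts both numbers from captured command output so they can be compared; the function names are illustrative, and the regular expressions assume the output formats shown in this thread.

```python
import re


def driver_cuda_version(nvidia_smi_output):
    """Return the 'CUDA Version' from the nvidia-smi header as (major, minor), or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def toolkit_cuda_version(nvcc_output):
    """Return the toolkit release reported by nvcc as (major, minor), or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


if __name__ == "__main__":
    # Lines copied from the nvidia-smi and nvcc output earlier in this thread.
    smi_line = ("| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   "
                "CUDA Version: 12.2     |")
    nvcc_line = "Cuda compilation tools, release 11.5, V11.5.119"
    print("driver supports up to:", driver_cuda_version(smi_line))
    print("installed toolkit:", toolkit_cuda_version(nvcc_line))
```

The toolkit version is the one to compare against what a given TensorFlow release expects; the nvidia-smi number only bounds it from above.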

I did delve deeper into this and found the following: Tensorflow 2.14.0 is initializing the above three libraries two times. The first time from here:

Breakpoint 3.4, stream_executor::PluginRegistry::RegisterFactoryInternal<stream_executor::blas::BlasSupport* (*)(stream_executor::internal::StreamExecutorInterface*)> (this=0x5555555a87c0, plugin_id=0x7ffff7899308 <stream_executor::cuda::(anonymous namespace)::plugin_id_value>, 
    plugin_name="cuBLAS", factory=0x7fffc9bfa9c4 <_FUN(stream_executor::internal::StreamExecutorInterface*)>, factories=0x5555555b43b8)
    at tensorflow/compiler/xla/stream_executor/plugin_registry.cc:75
75        if (factories->find(plugin_id) != factories->end()) {
(gdb) p plugin_name
$15 = "cuBLAS"
(gdb) bt
#0  stream_executor::PluginRegistry::RegisterFactoryInternal<stream_executor::blas::BlasSupport* (*)(stream_executor::internal::StreamExecutorInterface*)> (this=0x5555555a87c0, plugin_id=0x7ffff7899308 <stream_executor::cuda::(anonymous namespace)::plugin_id_value>, plugin_name="cuBLAS", 
    factory=0x7fffc9bfa9c4 <_FUN(stream_executor::internal::StreamExecutorInterface*)>, factories=0x5555555b43b8)
    at tensorflow/compiler/xla/stream_executor/plugin_registry.cc:75
#1  0x00007fffd2356470 in stream_executor::PluginRegistry::RegisterFactory<stream_executor::blas::BlasSupport* (*)(stream_executor::internal::StreamExecutorInterface*)> (this=0x5555555a87c0, platform_id=0x7ffff79cde40 <stream_executor::cuda::(anonymous namespace)::plugin_id_value>, 
    plugin_id=0x7ffff7899308 <stream_executor::cuda::(anonymous namespace)::plugin_id_value>, name="cuBLAS", 
    factory=0x7fffc9bfa9c4 <_FUN(stream_executor::internal::StreamExecutorInterface*)>)
    at tensorflow/compiler/xla/stream_executor/plugin_registry.cc:239
#2  0x00007fffc9bfaa8b in stream_executor::cuda::initialize_cublas () at tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1515
#3  0x00007fff9b0d1ce8 in google_init_module_register_cublas () at tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1528
#4  0x00007fffc9a1d822 in stream_executor::port::Initializer::Initializer (this=0x7fff9cddbad0 <google_initializer_module_register_cublas>, 
    func=0x7fff9b0d1cdf <google_init_module_register_cublas()>) at ./tensorflow/compiler/xla/stream_executor/platform/default/initialize.h:29
#5  0x00007fff9b0d4235 in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535)
    at tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1528
#6  0x00007fff9b0d424b in _GLOBAL__sub_I_cuda_blas.cc(void) () at tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1529
#7  0x00007ffff7fceeee in call_init (env=0x7fffffffd518, argv=0x7fffffffd508, argc=1, l=<optimized out>) at dl-init.c:90
#8  call_init (l=<optimized out>, argc=1, argv=0x7fffffffd508, env=0x7fffffffd518) at dl-init.c:27
#9  0x00007ffff7fcefdc in _dl_init (main_map=0x7ffff7ffe2d0, argc=1, argv=0x7fffffffd508, env=0x7fffffffd518) at dl-init.c:137
#10 0x00007ffff7fe52d0 in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
Breakpoint 3.5, stream_executor::PluginRegistry::RegisterFactoryInternal<stream_executor::dnn::DnnSupport* (*)(stream_executor::internal::StreamExecutorInterface*)> (this=0x5555555a87c0, plugin_id=0x7ffff7897078 <stream_executor::gpu::(anonymous namespace)::plugin_id_value>, 
    plugin_name="cuDNN", factory=0x7fffc9a0dde4 <_FUN(stream_executor::internal::StreamExecutorInterface*)>, factories=0x5555555b43e8)
    at tensorflow/compiler/xla/stream_executor/plugin_registry.cc:75
75        if (factories->find(plugin_id) != factories->end()) {
(gdb) p plugin_name
$16 = "cuDNN"
(gdb) bt
#0  stream_executor::PluginRegistry::RegisterFactoryInternal<stream_executor::dnn::DnnSupport* (*)(stream_executor::internal::StreamExecutorInterface*)> (this=0x5555555a87c0, plugin_id=0x7ffff7897078 <stream_executor::gpu::(anonymous namespace)::plugin_id_value>, plugin_name="cuDNN", 
    factory=0x7fffc9a0dde4 <_FUN(stream_executor::internal::StreamExecutorInterface*)>, factories=0x5555555b43e8)
    at tensorflow/compiler/xla/stream_executor/plugin_registry.cc:75
#1  0x00007fffd2356991 in stream_executor::PluginRegistry::RegisterFactory<stream_executor::dnn::DnnSupport* (*)(stream_executor::internal::StreamExecutorInterface*)> (this=0x5555555a87c0, platform_id=0x7ffff79cde40 <stream_executor::cuda::(anonymous namespace)::plugin_id_value>, 
    plugin_id=0x7ffff7897078 <stream_executor::gpu::(anonymous namespace)::plugin_id_value>, name="cuDNN", 
    factory=0x7fffc9a0dde4 <_FUN(stream_executor::internal::StreamExecutorInterface*)>)
    at tensorflow/compiler/xla/stream_executor/plugin_registry.cc:240
#2  0x00007fffc9a0deab in stream_executor::initialize_cudnn () at tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9339
#3  0x00007fff9b17cfd2 in google_init_module_register_cudnn () at tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9355
#4  0x00007fffc9a1d822 in stream_executor::port::Initializer::Initializer (this=0x7fff9cddd8a8 <google_initializer_module_register_cudnn>, 
    func=0x7fff9b17cfc9 <google_init_module_register_cudnn()>) at ./tensorflow/compiler/xla/stream_executor/platform/default/initialize.h:29
#5  0x00007fff9b18ba70 in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535)
    at tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9355
#6  0x00007fff9b18ba86 in _GLOBAL__sub_I_cuda_dnn.cc(void) () at tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9356
#7  0x00007ffff7fceeee in call_init (env=0x7fffffffd518, argv=0x7fffffffd508, argc=1, l=<optimized out>) at dl-init.c:90
#8  call_init (l=<optimized out>, argc=1, argv=0x7fffffffd508, env=0x7fffffffd518) at dl-init.c:27
#9  0x00007ffff7fcefdc in _dl_init (main_map=0x7ffff7ffe2d0, argc=1, argv=0x7fffffffd508, env=0x7fffffffd518) at dl-init.c:137
#10 0x00007ffff7fe52d0 in _dl_start_user () from /lib64/ld-linux-x86-64.so.2

and

Breakpoint 3.6, stream_executor::PluginRegistry::RegisterFactoryInternal<stream_executor::fft::FftSupport* (*)(stream_executor::internal::StreamExecutorInterface*)> (this=0x5555555a87c0, plugin_id=0x7ffff7898758 <stream_executor::gpu::(anonymous namespace)::plugin_id_value>, 
    plugin_name="cuFFT", factory=0x7fffc9ab09eb <_FUN(stream_executor::internal::StreamExecutorInterface*)>, factories=0x5555555b4418)
    at tensorflow/compiler/xla/stream_executor/plugin_registry.cc:75
75        if (factories->find(plugin_id) != factories->end()) {
(gdb) p plugin_name
$17 = "cuFFT"
(gdb) bt
#0  stream_executor::PluginRegistry::RegisterFactoryInternal<stream_executor::fft::FftSupport* (*)(stream_executor::internal::StreamExecutorInterface*)> (this=0x5555555a87c0, plugin_id=0x7ffff7898758 <stream_executor::gpu::(anonymous namespace)::plugin_id_value>, plugin_name="cuFFT", 
    factory=0x7fffc9ab09eb <_FUN(stream_executor::internal::StreamExecutorInterface*)>, factories=0x5555555b4418)
    at tensorflow/compiler/xla/stream_executor/plugin_registry.cc:75
#1  0x00007fffd2356ebb in stream_executor::PluginRegistry::RegisterFactory<stream_executor::fft::FftSupport* (*)(stream_executor::internal::StreamExecutorInterface*)> (this=0x5555555a87c0, platform_id=0x7ffff79cde40 <stream_executor::cuda::(anonymous namespace)::plugin_id_value>, 
    plugin_id=0x7ffff7898758 <stream_executor::gpu::(anonymous namespace)::plugin_id_value>, name="cuFFT", 
    factory=0x7fffc9ab09eb <_FUN(stream_executor::internal::StreamExecutorInterface*)>)
    at tensorflow/compiler/xla/stream_executor/plugin_registry.cc:241
#2  0x00007fffc9ab0ab1 in stream_executor::initialize_cufft () at tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:607
#3  0x00007fff9b228344 in google_init_module_register_cufft () at tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:618
#4  0x00007fffc9a1d822 in stream_executor::port::Initializer::Initializer (this=0x7fff9cddef90 <google_initializer_module_register_cufft>, 
    func=0x7fff9b22833b <google_init_module_register_cufft()>) at ./tensorflow/compiler/xla/stream_executor/platform/default/initialize.h:29
#5  0x00007fff9b2283f2 in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535)
    at tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:618
#6  0x00007fff9b228408 in _GLOBAL__sub_I_cuda_fft.cc(void) () at tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:619
#7  0x00007ffff7fceeee in call_init (env=0x7fffffffd518, argv=0x7fffffffd508, argc=1, l=<optimized out>) at dl-init.c:90
#8  call_init (l=<optimized out>, argc=1, argv=0x7fffffffd508, env=0x7fffffffd518) at dl-init.c:27
#9  0x00007ffff7fcefdc in _dl_init (main_map=0x7ffff7ffe2d0, argc=1, argv=0x7fffffffd508, env=0x7fffffffd518) at dl-init.c:137
#10 0x00007ffff7fe52d0 in _dl_start_user () from /lib64/ld-linux-x86-64.so.2

The second time from here:

Breakpoint 3.5, stream_executor::PluginRegistry::RegisterFactoryInternal<stream_executor::dnn::DnnSupport* (*)(stream_executor::internal::StreamExecutorInterface*)> (this=0x5555555a87c0, plugin_id=0x7ffff7897078 <stream_executor::gpu::(anonymous namespace)::plugin_id_value>, 
    plugin_name="cuDNN", factory=0x7fffc9a0dde4 <_FUN(stream_executor::internal::StreamExecutorInterface*)>, factories=0x5555555b43e8)
    at tensorflow/compiler/xla/stream_executor/plugin_registry.cc:75
75        if (factories->find(plugin_id) != factories->end()) {
(gdb) p plugin_name
$18 = "cuDNN"
(gdb) bt
#0  stream_executor::PluginRegistry::RegisterFactoryInternal<stream_executor::dnn::DnnSupport* (*)(stream_executor::internal::StreamExecutorInterface*)> (this=0x5555555a87c0, plugin_id=0x7ffff7897078 <stream_executor::gpu::(anonymous namespace)::plugin_id_value>, plugin_name="cuDNN", 
    factory=0x7fffc9a0dde4 <_FUN(stream_executor::internal::StreamExecutorInterface*)>, factories=0x5555555b43e8)
    at tensorflow/compiler/xla/stream_executor/plugin_registry.cc:75
#1  0x00007fffd2356991 in stream_executor::PluginRegistry::RegisterFactory<stream_executor::dnn::DnnSupport* (*)(stream_executor::internal::StreamExecutorInterface*)> (this=0x5555555a87c0, platform_id=0x7ffff79cde40 <stream_executor::cuda::(anonymous namespace)::plugin_id_value>, 
    plugin_id=0x7ffff7897078 <stream_executor::gpu::(anonymous namespace)::plugin_id_value>, name="cuDNN", 
    factory=0x7fffc9a0dde4 <_FUN(stream_executor::internal::StreamExecutorInterface*)>)
    at tensorflow/compiler/xla/stream_executor/plugin_registry.cc:240
#2  0x00007fffc9a0deab in stream_executor::initialize_cudnn () at tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9339
#3  0x00007fffc9a0e01e in google_init_module_register_cudnn () at tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9355
#4  0x00007fffc9a1d822 in stream_executor::port::Initializer::Initializer (this=0x7ffff7897070 <google_initializer_module_register_cudnn>, 
    func=0x7fffc9a0e015 <google_init_module_register_cudnn()>) at ./tensorflow/compiler/xla/stream_executor/platform/default/initialize.h:29
#5  0x00007fffc9a1cabc in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535)
    at tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9355
#6  0x00007fffc9a1cad2 in _GLOBAL__sub_I_cuda_dnn.cc(void) () at tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9356
#7  0x00007ffff7fceeee in call_init (env=0x7fffffffd518, argv=0x7fffffffd508, argc=1, l=<optimized out>) at dl-init.c:90
#8  call_init (l=<optimized out>, argc=1, argv=0x7fffffffd508, env=0x7fffffffd518) at dl-init.c:27
#9  0x00007ffff7fcefdc in _dl_init (main_map=0x7ffff7ffe2d0, argc=1, argv=0x7fffffffd508, env=0x7fffffffd518) at dl-init.c:137
#10 0x00007ffff7fe52d0 in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
Breakpoint 3.6, stream_executor::PluginRegistry::RegisterFactoryInternal<stream_executor::fft::FftSupport* (*)(stream_executor::internal::StreamExecutorInterface*)> (this=0x5555555a87c0, plugin_id=0x7ffff7898758 <stream_executor::gpu::(anonymous namespace)::plugin_id_value>, 
    plugin_name="cuFFT", factory=0x7fffc9ab09eb <_FUN(stream_executor::internal::StreamExecutorInterface*)>, factories=0x5555555b4418)
    at tensorflow/compiler/xla/stream_executor/plugin_registry.cc:75
75        if (factories->find(plugin_id) != factories->end()) {
(gdb) p plugin_name
$19 = "cuFFT"
(gdb) bt
#0  stream_executor::PluginRegistry::RegisterFactoryInternal<stream_executor::fft::FftSupport* (*)(stream_executor::internal::StreamExecutorInterface*)> (this=0x5555555a87c0, plugin_id=0x7ffff7898758 <stream_executor::gpu::(anonymous namespace)::plugin_id_value>, plugin_name="cuFFT", 
    factory=0x7fffc9ab09eb <_FUN(stream_executor::internal::StreamExecutorInterface*)>, factories=0x5555555b4418)
    at tensorflow/compiler/xla/stream_executor/plugin_registry.cc:75
#1  0x00007fffd2356ebb in stream_executor::PluginRegistry::RegisterFactory<stream_executor::fft::FftSupport* (*)(stream_executor::internal::StreamExecutorInterface*)> (this=0x5555555a87c0, platform_id=0x7ffff79cde40 <stream_executor::cuda::(anonymous namespace)::plugin_id_value>, 
    plugin_id=0x7ffff7898758 <stream_executor::gpu::(anonymous namespace)::plugin_id_value>, name="cuFFT", 
    factory=0x7fffc9ab09eb <_FUN(stream_executor::internal::StreamExecutorInterface*)>)
    at tensorflow/compiler/xla/stream_executor/plugin_registry.cc:241
#2  0x00007fffc9ab0ab1 in stream_executor::initialize_cufft () at tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:607
#3  0x00007fffc9ab0c24 in google_init_module_register_cufft () at tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:618
#4  0x00007fffc9a1d822 in stream_executor::port::Initializer::Initializer (this=0x7ffff7898750 <google_initializer_module_register_cufft>, 
    func=0x7fffc9ab0c1b <google_init_module_register_cufft()>) at ./tensorflow/compiler/xla/stream_executor/platform/default/initialize.h:29
#5  0x00007fffc9ab0cd2 in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535)
    at tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:618
#6  0x00007fffc9ab0ce8 in _GLOBAL__sub_I_cuda_fft.cc(void) () at tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:619
#7  0x00007ffff7fceeee in call_init (env=0x7fffffffd518, argv=0x7fffffffd508, argc=1, l=<optimized out>) at dl-init.c:90
#8  call_init (l=<optimized out>, argc=1, argv=0x7fffffffd508, env=0x7fffffffd518) at dl-init.c:27
#9  0x00007ffff7fcefdc in _dl_init (main_map=0x7ffff7ffe2d0, argc=1, argv=0x7fffffffd508, env=0x7fffffffd518) at dl-init.c:137
#10 0x00007ffff7fe52d0 in _dl_start_user () from /lib64/ld-linux-x86-64.so.2

and

Breakpoint 3.4, stream_executor::PluginRegistry::RegisterFactoryInternal<stream_executor::blas::BlasSupport* (*)(stream_executor::internal::StreamExecutorInterface*)> (this=0x5555555a87c0, plugin_id=0x7ffff7899308 <stream_executor::cuda::(anonymous namespace)::plugin_id_value>, 
    plugin_name="cuBLAS", factory=0x7fffc9bfa9c4 <_FUN(stream_executor::internal::StreamExecutorInterface*)>, factories=0x5555555b43b8)
    at tensorflow/compiler/xla/stream_executor/plugin_registry.cc:75
75        if (factories->find(plugin_id) != factories->end()) {
(gdb) p plugin_name
$20 = "cuBLAS"
(gdb) bt
#0  stream_executor::PluginRegistry::RegisterFactoryInternal<stream_executor::blas::BlasSupport* (*)(stream_executor::internal::StreamExecutorInterface*)> (this=0x5555555a87c0, plugin_id=0x7ffff7899308 <stream_executor::cuda::(anonymous namespace)::plugin_id_value>, plugin_name="cuBLAS", 
    factory=0x7fffc9bfa9c4 <_FUN(stream_executor::internal::StreamExecutorInterface*)>, factories=0x5555555b43b8)
    at tensorflow/compiler/xla/stream_executor/plugin_registry.cc:75
#1  0x00007fffd2356470 in stream_executor::PluginRegistry::RegisterFactory<stream_executor::blas::BlasSupport* (*)(stream_executor::internal::StreamExecutorInterface*)> (this=0x5555555a87c0, platform_id=0x7ffff79cde40 <stream_executor::cuda::(anonymous namespace)::plugin_id_value>, 
    plugin_id=0x7ffff7899308 <stream_executor::cuda::(anonymous namespace)::plugin_id_value>, name="cuBLAS", 
    factory=0x7fffc9bfa9c4 <_FUN(stream_executor::internal::StreamExecutorInterface*)>)
    at tensorflow/compiler/xla/stream_executor/plugin_registry.cc:239
#2  0x00007fffc9bfaa8b in stream_executor::cuda::initialize_cublas () at tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1515
#3  0x00007fffc9bfabfe in google_init_module_register_cublas () at tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1528
#4  0x00007fffc9a1d822 in stream_executor::port::Initializer::Initializer (this=0x7ffff7899300 <google_initializer_module_register_cublas>, 
    func=0x7fffc9bfabf5 <google_init_module_register_cublas()>) at ./tensorflow/compiler/xla/stream_executor/platform/default/initialize.h:29
#5  0x00007fffc9bfd14b in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535)
    at tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1528
#6  0x00007fffc9bfd161 in _GLOBAL__sub_I_cuda_blas.cc(void) () at tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1529
#7  0x00007ffff7fceeee in call_init (env=0x7fffffffd518, argv=0x7fffffffd508, argc=1, l=<optimized out>) at dl-init.c:90
#8  call_init (l=<optimized out>, argc=1, argv=0x7fffffffd508, env=0x7fffffffd518) at dl-init.c:27
#9  0x00007ffff7fcefdc in _dl_init (main_map=0x7ffff7ffe2d0, argc=1, argv=0x7fffffffd508, env=0x7fffffffd518) at dl-init.c:137
#10 0x00007ffff7fe52d0 in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
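
For anyone reading the backtraces: all three breakpoints stop at the same duplicate check (plugin_registry.cc:75), reached from a static initializer that runs while the dynamic loader initializes a shared library. Below is a minimal Python sketch of that pattern. It is a toy model, not TensorFlow's code; the class and function names are made up for illustration, and the idea that the same initializer runs twice because it is linked into more than one library is an assumption.

```python
class PluginRegistry:
    """Toy model of a factory registry keyed by plugin id (hypothetical, not TensorFlow's code)."""

    def __init__(self):
        self._factories = {}

    def register_factory(self, plugin_id, plugin_name, factory):
        # Same shape as the check seen at plugin_registry.cc:75:
        # a second registration under an existing id is rejected.
        if plugin_id in self._factories:
            return (
                f"Attempting to register factory for plugin {plugin_name} "
                "when one has already been registered"
            )
        self._factories[plugin_id] = factory
        return None


registry = PluginRegistry()


def static_initializer():
    """Stands in for a module-level initializer that runs at library load time."""
    return registry.register_factory("dnn", "cuDNN", factory=object)


first = static_initializer()   # first copy of the initializer runs: accepted
second = static_initializer()  # a second copy runs against the same registry: rejected
print(first)
print(second)
```

In this model the first registration stays in place, so the rejected second attempt is noisy but does not remove the plugin.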

Got similar error on Ubuntu 22.04.3 LTS 😦

Can confirm it works with 2.16.1. For those who have to resort to the 2.9.0 workaround (some of my packages are limited to 2.15), use Python <= 3.10 to install it.

It finally worked with TensorFlow 2.16.1 (upgrade to the latest): pip install --upgrade tensorflow

pip uninstall tensorflow && pip install tf-nightly[and-cuda]

This is not working either.

Same error on Ubuntu 22.04 LTS installed in WSL2 / Windows 11. Has anyone found a solution to this?

  • python 3.11.7
  • CUDA 12.3
  • cudnn Version: 8.6.0.163
  • tensorflow 2.16 and 2.17(tf-nightly)

Works perfectly. You can check which CUDA and cuDNN versions your TensorFlow build targets with this code:

from tensorflow.python.platform import build_info as tf_build_info
print("cudnn_version", tf_build_info.build_info['cudnn_version'])
print("cuda_version", tf_build_info.build_info['cuda_version'])
2024-01-19 00:55:58.149551: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-01-19 00:55:58.808528: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
cudnn_version 8
cuda_version 12.3
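
A slightly more defensive variant of the check above, as a sketch: it goes through tf.sysconfig.get_build_info() and uses .get so that a missing key does not raise. The helper name summarize_build is made up for illustration, and the assumption that some builds (for example CPU-only wheels) may lack these keys is mine.

```python
def summarize_build(build_info):
    """Return a one-line summary of the CUDA-related entries in a build-info mapping."""
    # .get avoids a KeyError when an entry is absent (assumed possible on CPU-only builds).
    cuda = build_info.get("cuda_version", "not reported")
    cudnn = build_info.get("cudnn_version", "not reported")
    return f"cuda_version={cuda} cudnn_version={cudnn}"


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        print("tensorflow", tf.__version__)
        print(summarize_build(tf.sysconfig.get_build_info()))
```

These values describe what the wheel was built against, not what is installed on the machine, so they are the numbers to match when installing CUDA and cuDNN manually.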

I am having the same issue as FaisalAlj above, on Windows 10 with the same versions of CUDA and CuDNN. The package tensorflow[and-cuda] is not found by pip. I’ve tried different versions of python and tensorflow without success. In my case I’m using virtualenv rather than conda.

Edit 1: I appear to be able to install tensorflow[and-cuda] as long as I use quotes around the package, like: pip install "tensorflow[and-cuda]".

Edit 2: I still appear to be getting these messages however, so I’m not sure I’ve installed things correctly.
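
On the quoting in Edit 1: some shells treat square brackets as glob patterns, which may be why the unquoted form was not found. One way to sidestep shell quoting entirely is to call pip from Python with an argument list, as in the sketch below. The function names are made up for illustration, and this only addresses the quoting, not the registration messages.

```python
import subprocess
import sys


def pip_install_command(requirement):
    """Build a pip command as an argument list, so no shell ever parses the brackets."""
    return [sys.executable, "-m", "pip", "install", requirement]


def pip_install(requirement):
    """Run pip for the current interpreter and return its exit code."""
    return subprocess.run(pip_install_command(requirement), check=False).returncode


command = pip_install_command("tensorflow[and-cuda]")
print(command)
# Call pip_install("tensorflow[and-cuda]") to actually run the install.
```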

@AthiemoneZero Because it still does output a GPU device at the bottom of the log, I am training on GPU, just without cuDNN. It will be slower, but it is better than nothing or training on CPU.

Yeah. But I just found that when I downgrade to version 2.13.0, the registration errors no longer appear. It looks like this:

(TF) ephys3@ZhouLab-Ephy3:~$ python3 -c "import tensorrt as trt;import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

2023-10-11 20:39:12.097457: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-10-11 20:39:12.130250: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-10-11 20:39:13.856721: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:65:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-11 20:39:13.870767: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:65:00.0/numa_node
Your kernel may have been built without NUMA support.
2023-10-11 20:39:13.870941: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] could not open file to read NUMA node: /sys/bus/pci/devices/0000:65:00.0/numa_node
Your kernel may have been built without NUMA support.
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

Although I haven’t figured out how to resolve the NUMA node message, I found some clues in another issue (I ran everything above in WSL Ubuntu). According to an explanation on the NVIDIA forums, it does not seem to be significant. So I guess the registration errors have something to do with the latest version, and the NUMA messages are caused by the OS environment. I hope this information helps someone.
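
To keep the two kinds of messages apart when comparing versions, the log lines can be grouped by their severity letter (E for the registration errors, I for the NUMA notes). This is a rough sketch that assumes the line layout seen in the logs in this thread; the regular expression and function name are my own.

```python
import re

# Layout assumed from the logs in this thread:
# "<date> <time>: <LEVEL> <source file>:<line>] <message>"
LOG_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+: "
    r"(?P<level>[IWEF]) (?P<source>[^\]]+):(?P<line>\d+)\] (?P<message>.*)$"
)


def group_by_level(lines):
    """Map severity letter -> list of (source file, message); skip continuation lines."""
    groups = {}
    for raw in lines:
        match = LOG_LINE.match(raw.strip())
        if match is None:
            continue  # e.g. "Your kernel may have been built without NUMA support."
        groups.setdefault(match.group("level"), []).append(
            (match.group("source"), match.group("message"))
        )
    return groups


sample = [
    "2023-10-09 13:36:23.355516: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] "
    "Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN "
    "when one has already been registered",
    "2023-10-11 20:39:13.856721: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:981] "
    "could not open file to read NUMA node: /sys/bus/pci/devices/0000:65:00.0/numa_node",
    "Your kernel may have been built without NUMA support.",
]
groups = group_by_level(sample)
for level, entries in sorted(groups.items()):
    print(level, len(entries))
```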

@AthiemoneZero Because it still does output a GPU device at the bottom of the log, I am training on GPU, just without cuDNN. It will be slower, but it is better than nothing or training on CPU.

It’s been five months, yet the problem remains.

You are right, what a shame, I gave up and went to Rust.

The NUMA node problem can be solved this way:

  1. Find the GPU's PCI address:

lspci | grep -i nvidia

01:00.0 VGA compatible controller: NVIDIA Corporation TU106 [GeForce RTX 2060 12GB] (rev a1)
01:00.1 Audio device: NVIDIA Corporation TU106 High Definition Audio Controller (rev a1)

The first line shows the address of the VGA-compatible device, the NVIDIA GeForce, as 01:00.0. It will be different on each machine, so change this part carefully in the commands below.

  2. List the devices under /sys/bus/pci/devices/:

ls /sys/bus/pci/devices/

0000:00:00.0 0000:00:06.0 0000:00:15.0 0000:00:1c.0 0000:00:1f.3 0000:00:1f.6 0000:02:00.0 0000:00:01.0 0000:00:14.0 0000:00:16.0 0000:00:1d.0 0000:00:1f.4 0000:01:00.0 0000:00:02.0 0000:00:14.2 0000:00:17.0 0000:00:1f.0 0000:00:1f.5 0000:01:00.1

The 01:00.0 address found above is in the list, with 0000: prefixed.

  3. Check the current NUMA node value:

cat /sys/bus/pci/devices/0000:01:00.0/numa_node

-1

-1 means no connection, and 0 means connected.

  4. Fix it with the command below:

sudo echo 0 | sudo tee -a /sys/bus/pci/devices/0000:01:00.0/numa_node

0
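
Steps 1 to 3 can also be done in one pass from Python, as in the sketch below. It only reads values and does not change anything. It assumes the usual sysfs layout where each device directory has a vendor file and, on kernels that expose it, a numa_node file. The function name is made up; 0x10de is NVIDIA's PCI vendor id.

```python
from pathlib import Path

NVIDIA_VENDOR_ID = "0x10de"  # PCI vendor id assigned to NVIDIA


def nvidia_numa_nodes(sysfs_root="/sys/bus/pci/devices"):
    """Map PCI address -> numa_node value (None if unreadable) for NVIDIA devices."""
    result = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return result  # not Linux, or sysfs is not mounted here
    for device in sorted(root.iterdir()):
        try:
            vendor = (device / "vendor").read_text().strip()
        except OSError:
            continue  # no readable vendor file, skip this entry
        if vendor != NVIDIA_VENDOR_ID:
            continue
        try:
            result[device.name] = int((device / "numa_node").read_text().strip())
        except (OSError, ValueError):
            result[device.name] = None  # file missing or not an integer
    return result


if __name__ == "__main__":
    for address, node in nvidia_numa_nodes().items():
        print(address, node)
```

A value of None here corresponds to the "could not open file to read NUMA node" case from the logs, where there is no file to write to.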

Thank you for your comment @qnlzgl . I have attempted to fix the issue in various ways, but none have proven successful for me.

OK, so basically: manually installing the libraries required for TensorFlow 2.13 (see Build Configs) before installing TensorFlow, and then installing TensorFlow without the GPU option (pip install tensorflow==2.13), works perfectly. My setup is driver version 520.61.05, CUDA version 11.8, cuDNN version 8.6.0.163. This should also work for 2.14, but use cuDNN 8.7. TensorFlow 2.15 is not tested.

works for me, thank you very much

Installing 2.16.0-dev20231212 with tf-nightly[and-cuda] resolved the issue for me.

Python 3.10, Ubuntu 22.04.3, WSL2, Windows 10

This has fixed the issue for me. For people who use JAX and encounter a "cuSolver ran into an error" message, this fixes it too.

tf-nightly[and-cuda]

Tried with tf-nightly-2.16.0.dev20231219. Still the same issues. Python 3.10.12 | WSL2 Ubuntu 18.04.5 LTS | Windows 11

@ddunl Since I’m not using master, I had to change the diff a little. I applied the attached patch to 2.14.0. After that everything still compiled and the duplicated registration of the plugins vanished from my ‘hello-world’ test program.

HOWEVER - verification with objdump shows that now neither libtensorflow_framework.so.2.14.0 nor libtensorflow_cc.so.2.14.0 contains the global registration variable anymore!

Is there a way to test whether things still really work? I have the feeling that the plugins now aren’t registered at all.
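
One rough way to probe this from Python, as a sketch rather than a definitive test: run one small op of each kind on the GPU and see whether any of them raises. The mapping of ops to libraries (matmul to cuBLAS, conv2d to cuDNN, fft to cuFFT) is my assumption about the usual GPU kernels, and the function name is made up.

```python
try:
    import tensorflow as tf
except ImportError:  # lets the sketch load on machines without TensorFlow
    tf = None


def exercise_gpu_libraries():
    """Run a matmul, a conv2d and an FFT on GPU:0; return output shapes, or None if unavailable."""
    if tf is None or not tf.config.list_physical_devices("GPU"):
        return None
    shapes = {}
    with tf.device("/GPU:0"):
        a = tf.random.normal([64, 64])
        shapes["matmul"] = tuple(tf.linalg.matmul(a, a).shape)  # assumed to use cuBLAS

        image = tf.random.normal([1, 16, 16, 3])
        kernel = tf.random.normal([3, 3, 3, 4])
        conv = tf.nn.conv2d(image, kernel, strides=1, padding="SAME")
        shapes["conv2d"] = tuple(conv.shape)  # assumed to use cuDNN

        signal = tf.cast(tf.random.normal([128]), tf.complex64)
        shapes["fft"] = tuple(tf.signal.fft(signal).shape)  # assumed to use cuFFT
    return shapes


result = exercise_gpu_libraries()
print(result)
```

If the plugins were not registered at all, I would expect at least one of these ops to fail on the GPU instead of returning a shape, though that expectation is untested against the patched build.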

Great, when trying to attach cuda_plugins.patch I get the message: We don’t support that file type.

Try again with GIF, JPEG, JPG, MOV, MP4, PNG, SVG, WEBM, CPUPROFILE, CSV, DMP, DOCX, FODG, FODP, FODS, FODT, GZ, JSON, JSONC, LOG, MD, ODF, ODG, ODP, ODS, ODT, PATCH, PDF, PPTX, TGZ, TXT, XLS, XLSX or ZIP.

I’ll paste it here, losing all required white-space requirements no doubt, but perhaps enough for inspection:

--- tensorflow/core/platform/build_config.default.bzl	2023-09-21 19:17:23.000000000 +0200
+++ tensorflow/core/platform/build_config.default.bzl	2023-11-16 14:32:12.072580107 +0100
@@ -1,12 +1,11 @@
 """OSS versions of Bazel macros that can't be migrated to TSL."""
 
+load("@local_config_rocm//rocm:build_defs.bzl", "if_rocm")
 load(
     "//tensorflow/tsl:tsl.bzl",
     "clean_dep",
     "if_libtpu",
 )
-load("@local_config_cuda//cuda:build_defs.bzl", "if_cuda")
-load("@local_config_rocm//rocm:build_defs.bzl", "if_rocm")
 load(
     "//third_party/mkl:build_defs.bzl",
     "if_mkl_ml",
@@ -26,11 +25,7 @@
         # core.
         clean_dep("//tensorflow/core/kernels:lookup_util"),
         clean_dep("//tensorflow/core/util/tensor_bundle"),
-    ] + if_cuda(
-        [
-            clean_dep("//tensorflow/compiler/xla/stream_executor:cuda_platform"),
-        ],
-    ) + if_rocm(
+    ] + if_rocm(
         [
             clean_dep("//tensorflow/compiler/xla/stream_executor:rocm_platform"),
             clean_dep("//tensorflow/compiler/xla/stream_executor/rocm:rocm_rpath"),

Same issue here on both WSL2 and Ubuntu.

>>> import tensorflow as tf
2023-10-28 13:19:49.412588: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-10-28 13:19:49.413083: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-10-28 13:19:49.437814: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered

Hi @Ke293-x2Ek-Qe-7-aE-B ,

I have checked the installation on Colab (Linux environment) and observed the same logs, as shown in the attached gist.

These logs seem to be generated by the XLA compiler, but the GPU is still detected. A similar issue, #62002, has already been brought to the engineering team's attention.

CC: @learning-to-play

@Ke293-x2Ek-Qe-7-aE-B You’re welcome. BTW, I also followed the instructions to configure the development environment, including suitable versions of Bazel and clang-16, just before all my digging into the conda env.

@Ke293-x2Ek-Qe-7-aE-B I apologize for my misunderstanding. I installed the CUDA toolkit the same way you described above before I went directly to debugging tf_gpu. I made sure my GPU and CUDA work well, as I ran another task smoothly using CUDA but without TF. What concerns me is that some dependencies of TF have to be pre-installed in a conda env, and this might be handled by [and-cuda] (my naive guess