serving: Docker with GPU failed call to cuInit: CUresult(-1)
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): 17.10
- TensorFlow Serving installed from (source or binary): binary
- TensorFlow Serving version: 1.9
- Docker version: 18.03.1-ce
- Nvidia docker version: 2.0.3
Describe the problem
I’m attempting to run TensorFlow Serving in a container that needs GPU access.
When I start the container and send requests to it, I don’t see the process listed in the host’s nvidia-smi output.
Looking at the log I saw a few odd errors.
Exact Steps to Reproduce
This is a minimal example that reproduces the error:
docker run -p 8501:8501 \
  -v /tmp/tfserving/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_three:/models/half_plus_three \
  -e MODEL_NAME=half_plus_three \
  -t tensorflow/serving:1.9.0-devel-gpu \
  tensorflow_model_server \
  --port=8500 \
  --rest_api_port=8501 \
  --model_name=half_plus_three \
  --model_base_path=/models/half_plus_three
Source code / logs
2018-07-26 05:55:57.044214: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-07-26 05:55:57.044874: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_driver.cc:397] failed call to cuInit: CUresult(-1)
2018-07-26 05:55:57.045256: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Reactions: 3
- Comments: 17
Running

rm /usr/local/cuda/lib64/stubs/libcuda.so.1

fixed my problem. A good way to test whether the GPU drivers in your container are set up correctly, before you start building the model server, is this script, which should return the details of your video card without any dependencies: https://gist.github.com/f0k/63a664160d016a491b2cbea15913d549
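As a quicker sanity check (a sketch, assuming the NVIDIA runtime is installed on the host and the image tag matches yours), you can run nvidia-smi inside the serving image before launching the model server; if the driver is visible to the container, it prints your GPU details:

```shell
# Sanity check: is the host GPU driver visible inside the container?
# Assumes nvidia-docker 2.x (--runtime=nvidia) and the 1.9.0-devel-gpu tag.
docker run --runtime=nvidia --rm tensorflow/serving:1.9.0-devel-gpu nvidia-smi
```

If this prints "failed to initialize NVML" or no GPU is listed, the container runtime is the problem, not TensorFlow Serving.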
I spent the last day debugging the same error in a similar configuration (Ubuntu 16.04, TFServing 1.9, Tesla P100). The GPU worked fine in tensorflow/tensorflow. Running in tensorflow/serving:nightly-devel-gpu fixed the problem.
https://github.com/tensorflow/serving/commit/4cbac38c307ea11527d0e45a3b18fd41f1b67601#diff-5442e32f8ca43e5ee752e24804404913
You need to use nvidia-docker to run the GPU build.
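The reproduction command above can be corrected accordingly. A minimal sketch, assuming nvidia-docker 2.x, where passing --runtime=nvidia is what makes libcuda.so from the host driver available inside the container:

```shell
# Same invocation as in the issue, but with the NVIDIA runtime enabled
# so cuInit can find the host's libcuda.so.
docker run --runtime=nvidia -p 8501:8501 \
  -v /tmp/tfserving/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_three:/models/half_plus_three \
  -e MODEL_NAME=half_plus_three \
  -t tensorflow/serving:1.9.0-devel-gpu \
  tensorflow_model_server \
  --port=8500 \
  --rest_api_port=8501 \
  --model_name=half_plus_three \
  --model_base_path=/models/half_plus_three
```

With nvidia-docker 1.x the equivalent is `nvidia-docker run ...` instead of `docker run --runtime=nvidia ...`.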