serving: [Bug] libcupti related error in official tensorflow/serving image prevents tensorboard profiler from gathering GPU usage

Bug Report


System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.04
  • TensorFlow Serving installed from (source or binary): official Docker image: tensorflow/serving:2.3.0-gpu
  • TensorFlow Serving version: 2.3.0-gpu

Describe the problem

I want to profile the GPU usage of a model served by the GPU version of the tensorflow/serving container with TensorBoard. However, after clicking the “Capture” button in the TensorBoard UI, it prompts <tensorflow-serving-container-id>: Failed to load libcupti (is it installed and accessible?), and the profiling results contain CPU data only, with no GPU-related data displayed. It turns out that /usr/local/cuda/extras/CUPTI/lib64 (the path to libcupti) is not included in $LD_LIBRARY_PATH in the official tensorflow/serving image by default. I tried launching a container from tensorflow/serving:2.3.0-gpu with -e LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64, but it prompts <tensorflow-serving-container-id>: Insufficient privilege to run libcupti (you need root permission). instead, and no GPU usage is displayed either, even though I ran both containers as root.
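One way to confirm the missing path is to check whether the CUPTI directory appears in the container's $LD_LIBRARY_PATH. The sketch below is illustrative: the default LD_LIBRARY_PATH value is an assumption based on the official GPU image layout, and the docker exec line in the comment uses a hypothetical container name.

```shell
# Inside the running container one could check directly (container name is hypothetical):
#   docker exec tensorflow-serving sh -c 'printenv LD_LIBRARY_PATH; ls /usr/local/cuda/extras/CUPTI/lib64'
# The path check itself, expressed portably:

CUPTI_DIR=/usr/local/cuda/extras/CUPTI/lib64
# LD_LIBRARY_PATH as shipped in the GPU serving image (value assumed, not verified here)
LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64

case ":$LD_LIBRARY_PATH:" in
  *":$CUPTI_DIR:"*) CUPTI_ON_PATH=yes ;;
  *)                CUPTI_ON_PATH=no  ;;
esac
echo "CUPTI dir on LD_LIBRARY_PATH: $CUPTI_ON_PATH"
```

With the assumed default value this prints `no`, matching the "Failed to load libcupti" prompt.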

Exact Steps to Reproduce

  • To reproduce the first error, which prompts "<tensorflow-serving-container-id>: Failed to load libcupti (is it installed and accessible?)":
    • Execute docker run -d --name "${NAME}" -p 8500:8500 -p 8501:8501 -v "${PWD}/models:/models" -v /etc/localtime:/etc/localtime:ro -v "${PWD}/config:/etc/tensorflow-serving/config:ro" -v "${PWD}/batching_config:/etc/tensorflow-serving/batching_config:ro" --gpus all ${IMAGE} --model_config_file=/etc/tensorflow-serving/config --enable_batching=true --batching_parameters_file=/etc/tensorflow-serving/batching_config
    • Launch TensorBoard in another container (I used tensorflow/tensorflow:2.3.0-gpu with tensorboard-plugin-profile installed) with --link for routing gRPC traffic to the tensorflow-serving container
    • Continuously send prediction requests to the model
    • Click “Capture” and enter the gRPC port and URL of tensorflow/serving
  • To reproduce the second error:
    • Change the command in the first step to docker run -d --name "${NAME}" -p 8500:8500 -p 8501:8501 -v "${PWD}/models:/models" -v /etc/localtime:/etc/localtime:ro -v "${PWD}/config:/etc/tensorflow-serving/config:ro" -v "${PWD}/batching_config:/etc/tensorflow-serving/batching_config:ro" -e LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64 --gpus all ${IMAGE} --model_config_file=/etc/tensorflow-serving/config --enable_batching=true --batching_parameters_file=/etc/tensorflow-serving/batching_config (i.e., overriding LD_LIBRARY_PATH with the -e option)
    • Follow the rest of the steps for replicating the previous error.
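Note that the -e override in the second reproduction replaces the whole library path and drops /usr/local/cuda/lib64. A safer way to build the override is to prepend the CUPTI directory to the image's default path rather than replacing it. This is a sketch, not a confirmed fix (it does not address the privilege error reported above), and the default path value is an assumption based on the official GPU image:

```shell
# Assumed default library path of the tensorflow/serving GPU image
DEFAULT_LIBS=/usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
CUPTI_DIR=/usr/local/cuda/extras/CUPTI/lib64

# Prepend the CUPTI dir so existing CUDA libraries stay resolvable
NEW_LD_LIBRARY_PATH="$CUPTI_DIR:$DEFAULT_LIBS"
echo "$NEW_LD_LIBRARY_PATH"

# Then pass it through -e when launching the container, e.g.:
#   docker run -d --gpus all -e LD_LIBRARY_PATH="$NEW_LD_LIBRARY_PATH" \
#     tensorflow/serving:2.3.0-gpu ...
```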

Source code / logs


About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 19

Most upvoted comments

It’s been 3 years and my job does not need me to write any TF code now…