tensorflow: Compilation fails on Ubuntu 20.04 when using TensorRT 8.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.04
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version: 2.4.1, 2.5, etc
  • Python version: 3.8
  • Installed using virtualenv? pip? conda?: no, built from source
  • Bazel version (if compiling from source): 3.1 (for TF 2.4.1), 3.7.2 (for TF 2.5.0-rcx)
  • GCC/Compiler version (if compiling from source): gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
  • CUDA/cuDNN version: CUDA 11.1, cuDNN 8 (8.0.5.39-1+cuda11.1) or CUDA 11.2, libcudnn 8.1.1, 8.2
  • GPU model and memory: GTX-1080ti
  • TensorRT (crucial): 8.0.0-1+cuda11.0, or 8.0.0-1+cuda11.3

Describe the problem

When compiling with support for TensorRT 8 (via libnvinfer8), compilation fails (log is below).

Provide the exact sequence of commands / steps that you executed before running into the problem

When configuring the build, enable TensorRT support and make sure TensorRT version 8 is selected. Build TF as usual. Compilation will fail.

If you install TensorRT version 7 manually (from debs available for Ubuntu 18.04), compilation will complete just fine.
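Because the build succeeds with TensorRT 7 and fails with TensorRT 8, it can help to check which version the installed headers report before starting a long build. The following is a minimal sketch, not part of TensorFlow's build. It assumes the headers define the `NV_TENSORRT_MAJOR`, `NV_TENSORRT_MINOR` and `NV_TENSORRT_PATCH` macros in `NvInferVersion.h`, and that the file lives under `/usr/include/x86_64-linux-gnu/` (an assumed location for the Ubuntu dev package; adjust it for your system). The function names and the `max_supported_major` parameter are hypothetical.

```python
import re
from pathlib import Path

# Assumed header location for the Ubuntu libnvinfer-dev package; adjust if yours differs.
DEFAULT_HEADER = Path("/usr/include/x86_64-linux-gnu/NvInferVersion.h")

# Matches lines such as: #define NV_TENSORRT_MAJOR 8 //!< TensorRT major version.
_VERSION_RE = re.compile(
    r"^\s*#\s*define\s+NV_TENSORRT_(MAJOR|MINOR|PATCH)\s+(\d+)", re.MULTILINE
)


def parse_tensorrt_version(header_text):
    """Return (major, minor, patch) from the text of NvInferVersion.h, or None."""
    found = {name: int(value) for name, value in _VERSION_RE.findall(header_text)}
    if not {"MAJOR", "MINOR", "PATCH"} <= set(found):
        return None
    return (found["MAJOR"], found["MINOR"], found["PATCH"])


def check_header(path=DEFAULT_HEADER, max_supported_major=7):
    """Return a warning string if the headers look too new, else None."""
    if not path.exists():
        return f"{path} not found; cannot determine the TensorRT version"
    version = parse_tensorrt_version(path.read_text())
    if version is None:
        return f"no NV_TENSORRT_* version macros found in {path}"
    if version[0] > max_supported_major:
        return (
            "TensorRT {}.{}.{} headers found; this TF version is only known "
            "to build with TensorRT {}".format(*version, max_supported_major)
        )
    return None


if __name__ == "__main__":
    print(check_header() or "TensorRT headers look compatible")
```

Running the script before configuring the build prints either a warning or a confirmation. The default of 7 for `max_supported_major` reflects only what is reported in this issue for TF 2.4.1 and 2.5.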

Any other info / logs

Relevant error: C++ compilation of rule '//tensorflow/compiler/tf2tensorrt:tensorrt_stub' failed (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command

In file included from bazel-out/k8-opt/bin/external/local_config_tensorrt/_virtual_includes/tensorrt_headers/third_party/tensorrt/NvInfer.h:54,
                 from tensorflow/compiler/tf2tensorrt/stub/nvinfer_stub.cc:17:
bazel-out/k8-opt/bin/external/local_config_tensorrt/_virtual_includes/tensorrt_headers/third_party/tensorrt/NvInferRuntime.h:2264:51: note: from previous declaration 'nvinfer1::IPluginRegistry* getPluginRegistry() noexcept'
 2264 | extern "C" TENSORRTAPI nvinfer1::IPluginRegistry* getPluginRegistry() noexcept;

Full log here: gesdm-tf2.5.0rc3-error.txt

About this issue

  • State: open
  • Created 3 years ago
  • Comments: 24 (18 by maintainers)

Most upvoted comments

As to your point about the TensorRT release notes mentioning breaking API changes, I can only say it is fair that this is acknowledged. However, the change was made in a way that literally breaks people's systems, with no usable way to recover them. As of now, there is no straightforward way to compile TensorFlow with TensorRT on Ubuntu 20.04. Essentially, the expectation is that people move to the latest and greatest, with the previous version losing all support. If that permanently breaks things, well, too bad.

That is why a warning on the TF side would be useful, if for no other reason than that it would save people a massive amount of time and effort trying to figure out what is wrong.

Indeed. The issue, again, is the questionable way NVIDIA ships packages across major version numbers. To give you a sense: you can actually have both TensorRT 7 and 8 installed side by side through the libnvinfer7 and libnvinfer8 packages and their related plugins. However, the dev package (libnvinfer-dev) is version agnostic, so when NVIDIA updates TensorRT to version 8, the dev package is forced to version 8 as well. The issue is made worse by the fact that NVIDIA not only upgraded the packages to 8 but also removed all the version 7 packages, so there is no way to get them through apt. I am using the ones still available for Ubuntu 18.04, installed manually from debs extracted from NVIDIA's large bundle package.
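The mismatch described above, where the version-agnostic libnvinfer-dev package moves to a different major version than the runtime library you intend to build against, can be detected from the package version strings alone. The sketch below is illustrative: it assumes version strings shaped like the ones quoted in this issue (for example `8.0.0-1+cuda11.0`), and the function names are hypothetical.

```python
import re

# Matches version strings such as '8.0.0-1+cuda11.0' (format assumed from this issue).
_DEB_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)-\d+\+cuda(\d+\.\d+)$")


def parse_deb_version(version):
    """Split '8.0.0-1+cuda11.0' into ((8, 0, 0), '11.0')."""
    match = _DEB_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognised TensorRT package version: {version!r}")
    major, minor, patch, cuda = match.groups()
    return (int(major), int(minor), int(patch)), cuda


def dev_matches_runtime(dev_version, runtime_major):
    """True if libnvinfer-dev has the same major version as the libnvinferN runtime."""
    (major, _, _), _ = parse_deb_version(dev_version)
    return major == runtime_major


if __name__ == "__main__":
    # With libnvinfer7 installed but libnvinfer-dev at 8.x, the headers do not match.
    print(dev_matches_runtime("8.0.0-1+cuda11.3", 7))
```

On a Debian-based system the input string can be obtained with `dpkg-query -W -f='${Version}' libnvinfer-dev`. The parsing is kept separate from the query so that it can be checked without the packages installed.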

TensorRT 8 has only just been released, but Ubuntu 20.04 has been around for about a year, which makes this move from NVIDIA not only questionable but rather inconsiderate.

Maybe add a line to the TensorRT section of the release notes for TF 2.5 specifying that only TensorRT 7 is supported…

Alternatively, or in addition, this page is quite outdated (Ubuntu 20.04 is not even mentioned):

https://www.tensorflow.org/install/gpu

The point is, the support page could set some very conservative version numbers for the required GPU libraries. However, those versions are not available in the current LTS (and supported) release of Ubuntu, which means that someone trying to build TF on it will not be able to complete the build. I understand this is less a problem with TF than an issue with NVIDIA's repos dropping older versions of TensorRT, but a clarification would go a long way.

Update: it looks like the issue is with TensorRT version 8, regardless of the CUDA or cuDNN version, and it affects both TF 2.4 and 2.5. Compiling against TensorRT 8 (specifically libnvinfer8, libnvinfer-dev, libnvinfer-plugin8 and libnvinfer-plugin-dev) fails. With TensorRT version 7, however, it compiles just fine. In fact, I managed to compile an earlier TF 2.5.0 release candidate without problems.

I would have no problem using TensorRT 7. Unfortunately, it is no longer available for Ubuntu 20.04: the NVIDIA repos only carry version 8. I was able to test version 7 by getting the Ubuntu 18.04 debs and installing them manually on 20.04, which is far from ideal.

So it seems there may be compatibility issues between TF and TensorRT 8, and NVIDIA is forcing an upgrade that essentially makes it impossible to build TF with TensorRT support on Ubuntu 20.04 unless the libraries are installed manually.

Happy to run more tests.

Interesting point. TensorRT 8 came out after TF 2.5, so we had no way of knowing that it would not work with TF 2.5. The TensorRT 8 release notes say there are breaking API changes, which can be read to mean that anything that worked with TensorRT 7 and was released before TensorRT 8 may be broken.