serving: GPU Build Issues Unhandled exception

We are having a few issues building with GPU support on the latest HEAD. We initially got past some local_cuda_config errors by pinning to the latest HEAD as of yesterday (https://github.com/tensorflow/tensorflow/commit/a5f8f42), but we are now receiving this error when attempting to build:

Based on CUDA 7.5, cuDNN 5.1.3.

Bazel 0.3.1

# env
CUDA_HOME=/usr/local/cuda
CUDA_PATH=/usr/local/cuda-7.5
CUDA_TOOLKIT_PATH=/usr/local/cuda-7.5
CUDA_VERSION=7.5
LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda-7.5/targets/x86_64-linux/lib:/usr/lib/x86_64-linux-gnu/
LIBRARY_PATH=/usr/local/cuda/lib64/stubs:
TF_CUDA_COMPUTE_CAPABILITIES=3.0
TF_CUDA_VERSION=7.5
TF_CUDNN_VERSION=5.1.3
TF_NEED_CUDA=1
bazel build -c opt --config=cuda --verbose_failures tensorflow_serving/...
INFO: Reading 'startup' options from /root/.bazelrc: --batch
____Loading package: tensorflow_serving/servables/tensorflow/testdata
____Loading...
Unhandled exception thrown during build; message: Unrecoverable error while evaluating node 'CONFIGURATION_FRAGMENT:com.google.devtools.build.lib.skyframe.ConfigurationFragmentValue$ConfigurationFragmentKey@8db88dd6' (requested by nodes 'CONFIGURATION_COLLECTION:com.google.devtools.build.lib.skyframe.ConfigurationCollectionValue$ConfigurationCollectionKey@54fe1e0', 'CONFIGURATION_FRAGMENT:com.google.devtools.build.lib.skyframe.ConfigurationFragmentValue$ConfigurationFragmentKey@e90862b7', 'CONFIGURATION_FRAGMENT:com.google.devtools.build.lib.skyframe.ConfigurationFragmentValue$ConfigurationFragmentKey@ff24d609')
____Elapsed time: 1.529s
java.lang.RuntimeException: Unrecoverable error while evaluating node 'CONFIGURATION_FRAGMENT:com.google.devtools.build.lib.skyframe.ConfigurationFragmentValue$ConfigurationFragmentKey@8db88dd6' (requested by nodes 'CONFIGURATION_COLLECTION:com.google.devtools.build.lib.skyframe.ConfigurationCollectionValue$ConfigurationCollectionKey@54fe1e0', 'CONFIGURATION_FRAGMENT:com.google.devtools.build.lib.skyframe.ConfigurationFragmentValue$ConfigurationFragmentKey@e90862b7', 'CONFIGURATION_FRAGMENT:com.google.devtools.build.lib.skyframe.ConfigurationFragmentValue$ConfigurationFragmentKey@ff24d609')
    at com.google.devtools.build.skyframe.ParallelEvaluator$Evaluate.run(ParallelEvaluator.java:1070)
    at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:474)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: com.google.devtools.build.lib.packages.NoSuchTargetException: no such target '@org_tensorflow//third_party/gpus/crosstool:crosstool': target 'crosstool' not declared in package 'third_party/gpus/crosstool' defined by /root/.cache/bazel/_bazel_root/1b03e6b0b95a8320062041ca0659e00e/external/org_tensorflow/third_party/gpus/crosstool/BUILD
    at com.google.devtools.build.lib.rules.cpp.CrosstoolConfigurationLoader.getCrosstoolProtofromBuildFile(CrosstoolConfigurationLoader.java:179)
    at com.google.devtools.build.lib.rules.cpp.CrosstoolConfigurationLoader.findCrosstoolConfiguration(CrosstoolConfigurationLoader.java:239)
    at com.google.devtools.build.lib.rules.cpp.CrosstoolConfigurationLoader.readCrosstool(CrosstoolConfigurationLoader.java:281)
    at com.google.devtools.build.lib.rules.cpp.CppConfigurationLoader.createParameters(CppConfigurationLoader.java:128)
    at com.google.devtools.build.lib.rules.cpp.CppConfigurationLoader.create(CppConfigurationLoader.java:73)
    at com.google.devtools.build.lib.rules.cpp.CppConfigurationLoader.create(CppConfigurationLoader.java:48)
    at com.google.devtools.build.lib.skyframe.ConfigurationFragmentFunction.compute(ConfigurationFragmentFunction.java:78)
    at com.google.devtools.build.skyframe.ParallelEvaluator$Evaluate.run(ParallelEvaluator.java:1016)
    ... 4 more
Caused by: com.google.devtools.build.lib.packages.NoSuchTargetException: no such target '@org_tensorflow//third_party/gpus/crosstool:crosstool': target 'crosstool' not declared in package 'third_party/gpus/crosstool' defined by /root/.cache/bazel/_bazel_root/1b03e6b0b95a8320062041ca0659e00e/external/org_tensorflow/third_party/gpus/crosstool/BUILD
    at com.google.devtools.build.lib.packages.Package.makeNoSuchTargetException(Package.java:559)
    at com.google.devtools.build.lib.packages.Package.getTarget(Package.java:543)
    at com.google.devtools.build.lib.skyframe.SkyframePackageLoaderWithValueEnvironment.getTarget(SkyframePackageLoaderWithValueEnvironment.java:71)
    at com.google.devtools.build.lib.skyframe.ConfigurationFragmentFunction$ConfigurationBuilderEnvironment.getTarget(ConfigurationFragmentFunction.java:193)
    at com.google.devtools.build.lib.rules.cpp.CrosstoolConfigurationLoader.getCrosstoolProtofromBuildFile(CrosstoolConfigurationLoader.java:177)
    ... 11 more
java.lang.RuntimeException: Unrecoverable error while evaluating node 'CONFIGURATION_FRAGMENT:com.google.devtools.build.lib.skyframe.ConfigurationFragmentValue$ConfigurationFragmentKey@8db88dd6' (requested by nodes 'CONFIGURATION_COLLECTION:com.google.devtools.build.lib.skyframe.ConfigurationCollectionValue$ConfigurationCollectionKey@54fe1e0', 'CONFIGURATION_FRAGMENT:com.google.devtools.build.lib.skyframe.ConfigurationFragmentValue$ConfigurationFragmentKey@e90862b7', 'CONFIGURATION_FRAGMENT:com.google.devtools.build.lib.skyframe.ConfigurationFragmentValue$ConfigurationFragmentKey@ff24d609')
    at com.google.devtools.build.skyframe.ParallelEvaluator$Evaluate.run(ParallelEvaluator.java:1070)
    at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:474)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: com.google.devtools.build.lib.packages.NoSuchTargetException: no such target '@org_tensorflow//third_party/gpus/crosstool:crosstool': target 'crosstool' not declared in package 'third_party/gpus/crosstool' defined by /root/.cache/bazel/_bazel_root/1b03e6b0b95a8320062041ca0659e00e/external/org_tensorflow/third_party/gpus/crosstool/BUILD
    at com.google.devtools.build.lib.rules.cpp.CrosstoolConfigurationLoader.getCrosstoolProtofromBuildFile(CrosstoolConfigurationLoader.java:179)
    at com.google.devtools.build.lib.rules.cpp.CrosstoolConfigurationLoader.findCrosstoolConfiguration(CrosstoolConfigurationLoader.java:239)
    at com.google.devtools.build.lib.rules.cpp.CrosstoolConfigurationLoader.readCrosstool(CrosstoolConfigurationLoader.java:281)
    at com.google.devtools.build.lib.rules.cpp.CppConfigurationLoader.createParameters(CppConfigurationLoader.java:128)
    at com.google.devtools.build.lib.rules.cpp.CppConfigurationLoader.create(CppConfigurationLoader.java:73)
    at com.google.devtools.build.lib.rules.cpp.CppConfigurationLoader.create(CppConfigurationLoader.java:48)
    at com.google.devtools.build.lib.skyframe.ConfigurationFragmentFunction.compute(ConfigurationFragmentFunction.java:78)
    at com.google.devtools.build.skyframe.ParallelEvaluator$Evaluate.run(ParallelEvaluator.java:1016)
    ... 4 more
Caused by: com.google.devtools.build.lib.packages.NoSuchTargetException: no such target '@org_tensorflow//third_party/gpus/crosstool:crosstool': target 'crosstool' not declared in package 'third_party/gpus/crosstool' defined by /root/.cache/bazel/_bazel_root/1b03e6b0b95a8320062041ca0659e00e/external/org_tensorflow/third_party/gpus/crosstool/BUILD
    at com.google.devtools.build.lib.packages.Package.makeNoSuchTargetException(Package.java:559)
    at com.google.devtools.build.lib.packages.Package.getTarget(Package.java:543)
    at com.google.devtools.build.lib.skyframe.SkyframePackageLoaderWithValueEnvironment.getTarget(SkyframePackageLoaderWithValueEnvironment.java:71)
    at com.google.devtools.build.lib.skyframe.ConfigurationFragmentFunction$ConfigurationBuilderEnvironment.getTarget(ConfigurationFragmentFunction.java:193)
    at com.google.devtools.build.lib.rules.cpp.CrosstoolConfigurationLoader.getCrosstoolProtofromBuildFile(CrosstoolConfigurationLoader.java:177)
    ... 11 more

It seems like the crosstool/BUILD file is empty and the CROSSTOOL.tpl hasn’t been converted to a CROSSTOOL file.

$ ls tensorflow/third_party/gpus/crosstool/
BUILD  BUILD.tpl  CROSSTOOL.tpl  LICENSE  clang

$ cat tensorflow/third_party/gpus/crosstool/BUILD
$

Any insight or place to poke around would be awesome, thanks!

About this issue

  • State: closed
  • Created 8 years ago
  • Comments: 19 (2 by maintainers)

Most upvoted comments

I had the same issue. First, to check that your TF CUDA config is correct, run cd tensorflow && bazel query 'kind(rule, @local_config_cuda//...)' --output label_kind, which should produce:

config_setting rule @local_config_cuda//cuda:using_nvcc
config_setting rule @local_config_cuda//cuda:using_clang_opt
config_setting rule @local_config_cuda//cuda:using_clang
config_setting rule @local_config_cuda//cuda:darwin
cc_library rule @local_config_cuda//cuda:cupti_headers
cc_library rule @local_config_cuda//cuda:cupti_dsos
cc_library rule @local_config_cuda//cuda:cudart_static
cc_library rule @local_config_cuda//cuda:cuda
cc_library rule @local_config_cuda//cuda:curand
cc_library rule @local_config_cuda//cuda:cufft
cc_library rule @local_config_cuda//cuda:cudnn
cc_library rule @local_config_cuda//cuda:cudart
cc_library rule @local_config_cuda//cuda:cuda_headers
cc_library rule @local_config_cuda//cuda:cublas
cc_toolchain_suite rule @local_config_cuda//crosstool:toolchain
cc_toolchain rule @local_config_cuda//crosstool:cc-compiler-local
cc_toolchain rule @local_config_cuda//crosstool:cc-compiler-darwin
filegroup rule @local_config_cuda//crosstool:empty
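As a rough way to automate that check, the query output can be piped through a small filter that looks for the toolchain suite line from the list above. This is only a sketch: has_cuda_toolchain is a hypothetical helper name, not something shipped with Bazel or TensorFlow, and it assumes the label_kind output format shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Bazel or TensorFlow): reads `bazel query`
# output on stdin and succeeds only if the CUDA toolchain suite is listed.
has_cuda_toolchain() {
    grep -qF 'cc_toolchain_suite rule @local_config_cuda//crosstool:toolchain'
}

# Usage, from the tensorflow checkout (needs bazel and a configured CUDA setup):
#   bazel query 'kind(rule, @local_config_cuda//...)' --output label_kind \
#     | has_cuda_toolchain && echo "CUDA toolchain generated"
```

If the filter fails, the local_config_cuda repository was most likely not generated, which matches the empty crosstool/BUILD file reported in the original post.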

Running the same command from the TensorFlow Serving repository root will fail (with errors) for two reasons:

1. The crosstool in tools/bazel.rc is invalid (AFAIK). Change @org_tensorflow//third_party/gpus/crosstool to @local_config_cuda//crosstool:toolchain.

2. The cuda_configure repository rule will fail (I haven’t looked into why exactly), but essentially running bazel clean --expunge && export TF_NEED_CUDA=1 will fix this.

Then run bazel query 'kind(rule, @local_config_cuda//...)' again and all is well (for me at least); the CUDA toolchain should be created in $(bazel info output_base)/external/local_config_cuda/cuda
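The label change in step 1 can be scripted. The following is a minimal sketch, assuming tools/bazel.rc references the old label exactly as quoted above; fix_crosstool_label is a hypothetical helper name, and the remaining commands from step 2 are shown only as comments because they need a working Bazel install.

```shell
#!/bin/sh
# Sketch only: rewrites the crosstool label in a bazel.rc-style file.
# fix_crosstool_label is a hypothetical helper, not an existing tool.
fix_crosstool_label() {
    rc_file="$1"
    # -i.bak keeps a backup copy and works with both GNU and BSD sed.
    # The trailing [^[:space:]]* also swallows an optional ":target" suffix.
    sed -i.bak \
        's|@org_tensorflow//third_party/gpus/crosstool[^[:space:]]*|@local_config_cuda//crosstool:toolchain|g' \
        "$rc_file"
}

# Usage from the TensorFlow Serving repository root (assumes bazel is installed):
#   fix_crosstool_label tools/bazel.rc
#   bazel clean --expunge
#   export TF_NEED_CUDA=1
#   bazel query 'kind(rule, @local_config_cuda//...)' --output label_kind
```

The original file is kept as tools/bazel.rc.bak, so the change is easy to revert if the build setup differs from the one described here.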

I have the same issue, not inside a Docker container, in a regular build with: bazel build -c opt --config=cuda tensorflow_serving/...

Building TensorFlow standalone with GPU support works fine, but not as part of a TensorFlow Serving build. (CUDA 8.0 and cuDNN 5.1.5)

Hello, I tested @rayglover-ibm's solution, and it works well with the latest master of tf-serving. I am wondering why this fix has not been merged into the tf-serving repo. Is there any plan to support GPU officially? Thanks!

+cc @damienmg since the exception might not be expected.

I’ll try to repro this to debug.