tensorflow: Possible bug in r1.8 when building from source?

I am trying to build TensorFlow with GPU support from source.

Basic information:

  • OS Platform and Distribution : Ubuntu 14.04
  • TensorFlow installed from : source code, branch r1.7
  • TensorFlow version : r1.7
  • Bazel version : 0.11.1
  • CUDA/cuDNN version: 8.0/6.0
  • GPU model and memory : Titan X
  • Exact command to reproduce: bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

My configuration:

Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]:
jemalloc as malloc support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n
No Google Cloud Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
No Hadoop File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
No Amazon S3 File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: n
No Apache Kafka Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with XLA JIT support? [y/N]:
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with GDR support? [y/N]:
No GDR support will be enabled for TensorFlow.

Do you wish to build TensorFlow with VERBS support? [y/N]:
No VERBS support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]:
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]: 8.0

Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 6.0

Please specify the location where cuDNN 6 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /home/chongyang/RemoteSensingImage/cuda

Do you wish to build TensorFlow with TensorRT support? [y/N]:
No TensorRT support will be enabled for TensorFlow.

Please specify the NCCL version you want to use. [Leave empty to default to NCCL 1.3]:

Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1,6.1,6.1,6.1]

Do you want to use clang as CUDA compiler? [y/N]:
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:

Do you wish to build TensorFlow with MPI support? [y/N]:
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:

Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]:
Not configuring the WORKSPACE for Android builds.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See tools/bazel.rc for more details.
        --config=mkl            # Build with MKL support.
        --config=monolithic     # Config for mostly static monolithic build.
Configuration finished

The error I encountered:

ERROR: /home/chongyang/RemoteSensingImage/tensorflow/tensorflow/stream_executor/BUILD:52:1: C++ compilation of rule '//tensorflow/stream_executor:cuda_platform' failed (Exit 1)
In file included from ./tensorflow/stream_executor/platform/port.h:21:0,
                 from ./tensorflow/stream_executor/device_memory.h:30,
                 from ./tensorflow/stream_executor/dnn.h:30,
                 from ./tensorflow/stream_executor/cuda/cuda_dnn.h:22,
                 from tensorflow/stream_executor/cuda/cuda_dnn.cc:16:
tensorflow/stream_executor/cuda/cuda_dnn.cc: In constructor 'perftools::gputools::cuda::CudnnRnnDescriptor::CudnnRnnDescriptor(perftools::gputools::cuda::CUDAExecutor*, cudnnHandle_t, int, int, int, cudnnRNNInputMode_t, cudnnDirectionMode_t, cudnnRNNMode_t, cudnnDataType_t, cudnnDataType_t, const perftools::gputools::dnn::AlgorithmConfig&, float, tensorflow::uint64, perftools::gputools::ScratchAllocator*)':
tensorflow/stream_executor/cuda/cuda_dnn.cc:1188:29: error: 'class perftools::gputools::dnn::AlgorithmConfig' has no member named 'is_default'
     CHECK(algorithm_config_.is_default())

Is this a bug in r1.7?

As the error message shows, 'class perftools::gputools::dnn::AlgorithmConfig' has no member named 'is_default'. I then checked the code in tensorflow/stream_executor/cuda/cuda_dnn.cc and tensorflow/stream_executor/dnn.h:

//tensorflow/stream_executor/dnn.h
class AlgorithmConfig {
 public:
  AlgorithmConfig() {}
  explicit AlgorithmConfig(AlgorithmDesc algorithm) : algorithm_(algorithm) {}
  AlgorithmConfig(AlgorithmDesc algorithm, AlgorithmDesc algorithm_no_scratch)
      : algorithm_(algorithm), algorithm_no_scratch_(algorithm_no_scratch) {}
  AlgorithmDesc algorithm() const { return algorithm_; }
  void set_algorithm(AlgorithmDesc val) { algorithm_ = val; }
  AlgorithmDesc algorithm_no_scratch() const { return algorithm_no_scratch_; }
  void set_algorithm_no_scratch(AlgorithmDesc val) {
    algorithm_no_scratch_ = val;
  }
  bool operator==(const AlgorithmConfig& other) const {
    return this->algorithm_ == other.algorithm_ &&
           this->algorithm_no_scratch_ == other.algorithm_no_scratch_;
  }
  bool operator!=(const AlgorithmConfig& other) const {
    return !(*this == other);
  }
  string ToString() const;

 private:
  AlgorithmDesc algorithm_;
  AlgorithmDesc algorithm_no_scratch_;
};

// tensorflow/stream_executor/cuda/cuda_dnn.cc
CudnnRnnDescriptor(CUDAExecutor* parent, cudnnHandle_t cudnn_handle,
                     int num_layers, int hidden_size, int input_size,
                     cudnnRNNInputMode_t input_mode,
                     cudnnDirectionMode_t direction_mode,
                     cudnnRNNMode_t rnn_mode, cudnnDataType_t data_type,
                     cudnnDataType_t compute_type,
                     const dnn::AlgorithmConfig& algorithm_config,
                     float dropout, uint64 seed,
                     ScratchAllocator* state_allocator)
      : parent_(parent),
        rnn_desc_(nullptr),
        num_layers_(num_layers),
        hidden_size_(hidden_size),
        input_size_(input_size),
        input_mode_(input_mode),
        direction_mode_(direction_mode),
        rnn_mode_(rnn_mode),
        data_type_(data_type),
        compute_type_(compute_type),
        // algorithm_config_ is initialized here.
        algorithm_config_(algorithm_config) {
....
    // is_default() is called on algorithm_config_ here, which triggers
    // the compile error.
    CHECK(algorithm_config_.is_default())
        << "Non-default algorithm not supported for CUDA version < 6.0";

However, AlgorithmConfig (the type of algorithm_config_) has no member function named is_default(). Only the class AlgorithmDesc has this function.
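
The mismatch can be reproduced outside TensorFlow. The following is a minimal sketch, not TensorFlow code: the classes in the hypothetical namespace `demo` are simplified stand-ins that keep only the accessors quoted from dnn.h, and `kDefaultAlgorithm`, `has_is_default`, and `UsesDefaultAlgorithms` are names invented for illustration. The `static_assert` lines check at compile time that `is_default()` exists on the descriptor but not on the config, and the helper shows one way the same check could be written through the config's accessors. It is not meant to represent the upstream fix.

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

namespace demo {

// Hypothetical sentinel meaning "no specific algorithm was chosen".
constexpr std::int64_t kDefaultAlgorithm = -1;

// Simplified stand-in for dnn::AlgorithmDesc (assumption: the real class
// carries more state; only is_default() matters for this sketch).
class AlgorithmDesc {
 public:
  AlgorithmDesc() : algo_(kDefaultAlgorithm) {}
  explicit AlgorithmDesc(std::int64_t algo) : algo_(algo) {}
  bool is_default() const { return algo_ == kDefaultAlgorithm; }

 private:
  std::int64_t algo_;
};

// Stand-in for dnn::AlgorithmConfig with the accessors quoted from dnn.h.
// As in the quoted class, there is no is_default() member here.
class AlgorithmConfig {
 public:
  AlgorithmConfig() {}
  explicit AlgorithmConfig(AlgorithmDesc algorithm) : algorithm_(algorithm) {}
  AlgorithmDesc algorithm() const { return algorithm_; }
  AlgorithmDesc algorithm_no_scratch() const { return algorithm_no_scratch_; }

 private:
  AlgorithmDesc algorithm_;
  AlgorithmDesc algorithm_no_scratch_;
};

// Detection idiom: has_is_default<T>::value is true only when
// `t.is_default()` is a valid expression for a const T.
template <typename... Ts>
struct make_void { typedef void type; };

template <typename T, typename = void>
struct has_is_default : std::false_type {};

template <typename T>
struct has_is_default<
    T, typename make_void<
           decltype(std::declval<const T&>().is_default())>::type>
    : std::true_type {};

static_assert(has_is_default<AlgorithmDesc>::value,
              "AlgorithmDesc provides is_default()");
static_assert(!has_is_default<AlgorithmConfig>::value,
              "AlgorithmConfig does not provide is_default()");

// Hypothetical helper: asks each contained descriptor instead of the config.
inline bool UsesDefaultAlgorithms(const AlgorithmConfig& config) {
  return config.algorithm().is_default() &&
         config.algorithm_no_scratch().is_default();
}

}  // namespace demo
```

With these stand-ins, writing `config.is_default()` would fail to compile with the same "has no member named 'is_default'" diagnostic, while `demo::UsesDefaultAlgorithms(config)` compiles and returns true for a default-constructed config.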

Thanks in advance for helping me solve this problem, and please correct me if I am wrong!

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 16 (6 by maintainers)

Most upvoted comments

Looks like this was resolved by upgrading cuDNN. Closing the issue.

@tensorflowbutler Right, the issue remains unsolved.