tensorflow: Possible bug in r1.8 when building from source?
I am trying to build TensorFlow with GPU support from source.
Basic information:
- OS Platform and Distribution : ubuntu 14.04
- TensorFlow installed from : source code, branch r1.7
- TensorFlow version : r1.7
- Bazel version : 0.11.1
- CUDA/cuDNN version: 8.0/6.0
- GPU model and memory : Titan X
- Exact command to reproduce:
bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
My configuration:
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]:
jemalloc as malloc support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n
No Google Cloud Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
No Hadoop File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
No Amazon S3 File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: n
No Apache Kafka Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with XLA JIT support? [y/N]:
No XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with GDR support? [y/N]:
No GDR support will be enabled for TensorFlow.
Do you wish to build TensorFlow with VERBS support? [y/N]:
No VERBS support will be enabled for TensorFlow.
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]:
No OpenCL SYCL support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]: 8.0
Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 6.0
Please specify the location where cuDNN 6 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:/home/chongyang/RemoteSensingImage/cuda
Do you wish to build TensorFlow with TensorRT support? [y/N]:
No TensorRT support will be enabled for TensorFlow.
Please specify the NCCL version you want to use. [Leave empty to default to NCCL 1.3]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1,6.1,6.1,6.1]
Do you want to use clang as CUDA compiler? [y/N]:
nvcc will be used as CUDA compiler.
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Do you wish to build TensorFlow with MPI support? [y/N]:
No MPI support will be enabled for TensorFlow.
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]:
Not configuring the WORKSPACE for Android builds.
Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See tools/bazel.rc for more details.
--config=mkl # Build with MKL support.
--config=monolithic # Config for mostly static monolithic build.
Configuration finished
The error I encountered:
ERROR: /home/chongyang/RemoteSensingImage/tensorflow/tensorflow/stream_executor/BUILD:52:1: C++ compilation of rule '//tensorflow/stream_executor:cuda_platform' failed (Exit 1)
In file included from ./tensorflow/stream_executor/platform/port.h:21:0,
from ./tensorflow/stream_executor/device_memory.h:30,
from ./tensorflow/stream_executor/dnn.h:30,
from ./tensorflow/stream_executor/cuda/cuda_dnn.h:22,
from tensorflow/stream_executor/cuda/cuda_dnn.cc:16:
tensorflow/stream_executor/cuda/cuda_dnn.cc: In constructor 'perftools::gputools::cuda::CudnnRnnDescriptor::CudnnRnnDescriptor(perftools::gputools::cuda::CUDAExecutor*, cudnnHandle_t, int, int, int, cudnnRNNInputMode_t, cudnnDirectionMode_t, cudnnRNNMode_t, cudnnDataType_t, cudnnDataType_t, const perftools::gputools::dnn::AlgorithmConfig&, float, tensorflow::uint64, perftools::gputools::ScratchAllocator*)':
tensorflow/stream_executor/cuda/cuda_dnn.cc:1188:29: error: 'class perftools::gputools::dnn::AlgorithmConfig' has no member named 'is_default'
CHECK(algorithm_config_.is_default())
This looks like a bug in r1.7?
As the error message shows, ‘class perftools::gputools::dnn::AlgorithmConfig’ has no member named ‘is_default’. I then checked the code in tensorflow/stream_executor/cuda/cuda_dnn.cc and tensorflow/stream_executor/dnn.h:
// tensorflow/stream_executor/dnn.h
class AlgorithmConfig {
 public:
  AlgorithmConfig() {}
  explicit AlgorithmConfig(AlgorithmDesc algorithm) : algorithm_(algorithm) {}
  AlgorithmConfig(AlgorithmDesc algorithm, AlgorithmDesc algorithm_no_scratch)
      : algorithm_(algorithm), algorithm_no_scratch_(algorithm_no_scratch) {}
  AlgorithmDesc algorithm() const { return algorithm_; }
  void set_algorithm(AlgorithmDesc val) { algorithm_ = val; }
  AlgorithmDesc algorithm_no_scratch() const { return algorithm_no_scratch_; }
  void set_algorithm_no_scratch(AlgorithmDesc val) {
    algorithm_no_scratch_ = val;
  }
  bool operator==(const AlgorithmConfig& other) const {
    return this->algorithm_ == other.algorithm_ &&
           this->algorithm_no_scratch_ == other.algorithm_no_scratch_;
  }
  bool operator!=(const AlgorithmConfig& other) const {
    return !(*this == other);
  }
  string ToString() const;

 private:
  AlgorithmDesc algorithm_;
  AlgorithmDesc algorithm_no_scratch_;
};
// tensorflow/stream_executor/cuda/cuda_dnn.cc
CudnnRnnDescriptor(CUDAExecutor* parent, cudnnHandle_t cudnn_handle,
                   int num_layers, int hidden_size, int input_size,
                   cudnnRNNInputMode_t input_mode,
                   cudnnDirectionMode_t direction_mode,
                   cudnnRNNMode_t rnn_mode, cudnnDataType_t data_type,
                   cudnnDataType_t compute_type,
                   const dnn::AlgorithmConfig& algorithm_config,
                   float dropout, uint64 seed,
                   ScratchAllocator* state_allocator)
    : parent_(parent),
      rnn_desc_(nullptr),
      num_layers_(num_layers),
      hidden_size_(hidden_size),
      input_size_(input_size),
      input_mode_(input_mode),
      direction_mode_(direction_mode),
      rnn_mode_(rnn_mode),
      data_type_(data_type),
      compute_type_(compute_type),
      // algorithm_config_ (a dnn::AlgorithmConfig) is initialized here.
      algorithm_config_(algorithm_config) {
  ....
  // algorithm_config_ calls is_default() here, which triggers the compile error.
  CHECK(algorithm_config_.is_default())
      << "Non-default algorithm not supported for CUDA version < 6.0";
In fact, the class of algorithm_config_ (AlgorithmConfig) has no member named is_default(). Only class AlgorithmDesc has this function.
Thanks in advance for any help with this problem, and please correct me if I am wrong!
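To make the mismatch concrete, here is a minimal, self-contained sketch. The two classes are simplified stand-ins written for illustration, not the real TensorFlow definitions, and `IsDefaultConfig` and `kDefaultAlgorithm` are hypothetical names that do not come from the TensorFlow source. It only shows the shape of the problem: is_default() lives on the descriptor, so a config has to be queried through its accessors.

```cpp
#include <cstdint>

// Assumed sentinel value for "default algorithm"; illustrative only.
constexpr std::int64_t kDefaultAlgorithm = -1;

// Simplified stand-in for dnn::AlgorithmDesc: it owns is_default().
class AlgorithmDesc {
 public:
  AlgorithmDesc() : algo_(kDefaultAlgorithm) {}
  explicit AlgorithmDesc(std::int64_t algo) : algo_(algo) {}
  bool is_default() const { return algo_ == kDefaultAlgorithm; }

 private:
  std::int64_t algo_;
};

// Simplified stand-in for dnn::AlgorithmConfig: like the class quoted
// above, it exposes two descriptors but has no is_default() of its own.
class AlgorithmConfig {
 public:
  AlgorithmConfig() {}
  explicit AlgorithmConfig(AlgorithmDesc algorithm) : algorithm_(algorithm) {}
  AlgorithmDesc algorithm() const { return algorithm_; }
  AlgorithmDesc algorithm_no_scratch() const { return algorithm_no_scratch_; }

 private:
  AlgorithmDesc algorithm_;
  AlgorithmDesc algorithm_no_scratch_;
};

// Hypothetical helper: a config counts as default when both of its
// descriptors are default. Writing config.is_default() instead would fail
// to compile with the same "has no member named 'is_default'" error.
inline bool IsDefaultConfig(const AlgorithmConfig& config) {
  return config.algorithm().is_default() &&
         config.algorithm_no_scratch().is_default();
}
```

With these stand-ins, `IsDefaultConfig(AlgorithmConfig())` is true, while a config built from `AlgorithmDesc(3)` is not default. Whether the real code should check one descriptor or both is a design choice this sketch does not settle.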
About this issue
- State: closed
- Created 6 years ago
- Comments: 16 (6 by maintainers)
Looks like this is resolved with a cuDNN upgrade. Closing the issue.
@tensorflowbutler Right, the issue remains unsolved.
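One possible reading of why a cuDNN upgrade could make the error disappear is sketched below. The wording of the failing CHECK message suggests the line may sit in a branch that is only compiled for older library versions, but the excerpt above does not show any preprocessor context, so this is an assumption. `DEMO_LIB_VERSION`, `Config` and `SelectedBranch` are hypothetical names used only for illustration.

```cpp
#include <string>

// Hypothetical stand-in for a version macro that a library header would
// normally define. Building with -DDEMO_LIB_VERSION=5000 selects the
// other branch below.
#ifndef DEMO_LIB_VERSION
#define DEMO_LIB_VERSION 6000
#endif

// Deliberately has no is_default() member, like AlgorithmConfig above.
struct Config {};

inline std::string SelectedBranch(const Config& config) {
#if DEMO_LIB_VERSION >= 6000
  // Newer header: the compiler never sees the broken call in the #else branch.
  (void)config;
  return "new-api";
#else
  // Older header: this branch is compiled, and the missing member becomes
  // a hard compile error.
  return config.is_default() ? "old-api-default" : "old-api-custom";
#endif
}
```

Under this assumption, the build would fail only when the compiler picks up an older header than intended, and pointing the build at a newer one would hide the broken line rather than fix it.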