tensorflow: Missing input file mpi:mpio.h

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution: Linux Ubuntu 16.04
  • TensorFlow installed from: source
  • TensorFlow version: master
  • Python version: 3.5.2
  • Bazel version (if compiling from source): 0.5.2
  • CUDA/cuDNN version: 8.0
  • GPU model and memory: K80
  • Exact command to reproduce:
#!/usr/bin/env bash
# Only the compilation step for tensorflow is in this script, for clarity.

git clone https://github.com/tensorflow/tensorflow
cd ./tensorflow
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda-8.0/targets/x86_64-linux/lib:/usr/lib/x86_64-linux-gnu/
export PYTHON_BIN_PATH="/home/ubuntu/anaconda3/bin/python"
export PYTHON_LIB_PATH="/home/ubuntu/anaconda3/lib/python3.6/site-packages"
export TF_NEED_JEMALLOC=1
export TF_NEED_GCP=0
export TF_NEED_HDFS=0
export TF_ENABLE_XLA=0
export TF_NEED_VERBS=0
export TF_NEED_OPENCL=0
export TF_NEED_CUDA=1
export TF_CUDA_CLANG=0
export TF_NEED_MPI=1
export MPI_HOME="/usr/lib/openmpi"
export GCC_HOST_COMPILER_PATH="/usr/bin/gcc"
export CUDA_VERSION='8.0'
export CUDNN_VERSION='6'
export CUDNN_INSTALL_PATH=/usr/local/cuda
export CUDA_COMPUTE_CAPABILITIES='3.7'
export CUDA_PATH='/usr/local/cuda'
export CUDA_PATH_LINUX='/opt/cuda'
yes "" | ./configure
bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 --config=cuda -k //tensorflow/tools/pip_package:build_pip_package && \
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

Describe the problem

When MPI support is turned on (TF_NEED_MPI=1), the build fails with the following errors:

INFO: Found 1 target...
ERROR: missing input file '//third_party/mpi:mpicxx.h'.
ERROR: missing input file '//third_party/mpi:mpi.h'.
ERROR: missing input file '//third_party/mpi:mpio.h'.
ERROR: /home/ubuntu/scripts/tensorflow/third_party/mpi/BUILD:18:1: //third_party/mpi:mpi: missing input file '//third_party/mpi:mpicxx.h'.
ERROR: /home/ubuntu/scripts/tensorflow/third_party/mpi/BUILD:18:1: //third_party/mpi:mpi: missing input file '//third_party/mpi:mpio.h'.
ERROR: /home/ubuntu/scripts/tensorflow/third_party/mpi/BUILD:18:1: //third_party/mpi:mpi: missing input file '//third_party/mpi:mpi.h'.

Source code / logs

Before running the build script I install Bazel 0.5.2 (0.5.3 breaks the build).

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 19 (9 by maintainers)

Most upvoted comments

@episodeyang
I’m using the latest OpenMPI and MVAPICH versions, built from source. I’m not sure what the Ubuntu pre-built packages do and do not install. In this case it looks like the configuration script cannot find the ‘mpi_portable_platform.h’ file and therefore incorrectly assumes you are using MVAPICH. I would install the latest MVAPICH version, which is confirmed to work (http://mvapich.cse.ohio-state.edu/downloads/).
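To make the heuristic described above concrete, here is a minimal sketch of the decision as this comment describes it: if mpi_portable_platform.h is present the installation is treated as OpenMPI, otherwise MVAPICH is assumed. The function name guess_mpi_flavor and the "$MPI_HOME/include" layout are assumptions for illustration; this is not the actual code from TensorFlow's configure script.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of TensorFlow): guess the MPI flavor from the
# headers present in an include directory, following the heuristic described
# in the comment above.
guess_mpi_flavor() {
  local include_dir="$1"
  if [ ! -e "${include_dir}/mpi.h" ]; then
    echo "none"              # no MPI headers here at all
  elif [ -e "${include_dir}/mpi_portable_platform.h" ]; then
    echo "openmpi"           # marker header found
  else
    echo "mvapich-assumed"   # marker missing, so MVAPICH is assumed
  fi
}

# Assumption: headers are expected under $MPI_HOME/include.
guess_mpi_flavor "${MPI_HOME:-/usr/lib/openmpi}/include"
```

If this prints "mvapich-assumed" on a machine that only has OpenMPI installed, that would be consistent with the missing mpio.h and mpicxx.h errors in the report.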

Note that there has been a recent change which breaks the compilation until pull request #11935 is merged.

Yes, give it a shot with this setting: export MPI_HOME=/usr/lib/openmpi. That should result in correct symlinks.
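One way to check whether the symlinks mentioned above came out correctly is to look for links whose targets do not exist. The sketch below assumes that the configure step populates third_party/mpi with symlinks into the MPI installation; the helper name find_dangling_links is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every symlink directly inside a directory whose
# target does not exist (a "dangling" link).
find_dangling_links() {
  local dir="$1"
  [ -d "$dir" ] || return 0   # nothing to check if the directory is absent
  # `test -e` follows the link, so it fails when the target is missing.
  find "$dir" -maxdepth 1 -type l ! -exec test -e {} \; -print
}

# Run from the TensorFlow source root after ./configure has finished.
find_dangling_links third_party/mpi
```

An empty result means every link resolves. Any path that is printed points at a header that is not present under the chosen MPI_HOME.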

There is no need to pass any extra flags during the build. I downloaded the OpenMPI 3.0.0 source and compiled it with C++ support, although C++ support is deprecated. Then I pointed TensorFlow at the new OpenMPI installation. After that it compiled fine.

@jbedorf @eLvErDe @ahaider3 @xgh45 @episodeyang I got the build to succeed on the trunk version of TensorFlow by setting the OMPI_SKIP_MPICXX macro. However, it seems that it has to be passed through the environment variable CC_OPT_FLAGS="-DOMPI_SKIP_MPICXX=1 -march=native" instead. Only in this way does the definition reach the source code layer and bypass the original definition of the macro OMPI_SKIP_MPICXX in the source code (in this case mpi.h). CC_OPT_FLAGS is the variable that configure.py (on the trunk version of TensorFlow) uses to set the values of --copt and --cxxopt for the bazel build process.
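As a sketch of how the setting described above could slot into the build script from the original report: the variable is exported before the configure step runs, so that configure.py can pick it up. The placement is an assumption based on this comment and has not been verified against a particular TensorFlow revision.

```shell
#!/usr/bin/env bash
# Define OMPI_SKIP_MPICXX so that mpi.h skips the deprecated C++ bindings
# (mpicxx.h); -march=native is carried over from the comment above.
export CC_OPT_FLAGS="-DOMPI_SKIP_MPICXX=1 -march=native"
echo "CC_OPT_FLAGS=${CC_OPT_FLAGS}"

# Then, from the TensorFlow source root, rerun the configure step from the
# original script so the flags are recorded for the bazel build:
#   yes "" | ./configure
```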