tensorflow: sess.run(tf.global_variables_initializer()) is slow after installing 1.0.0.
The issue
sess.run(tf.global_variables_initializer()) intermittently takes a long time since installing 1.0.0. I believe this also happened on my Python 2 0.12 install, but only after installing 1.0.0 for Python 3. This particular run was in Python 2, and the initializer took 32 seconds - see bottom of this post (and the attached cprofile output) for more detail.
Strangely, it seems like the more I run the same file, the faster it runs.
What related GitHub issues or StackOverflow threads have you found by searching the web for your problem?
None.
Environment info
Operating System: Ubuntu 16.04
Installed version of CUDA and cuDNN:
(please attach the output of ls -l /path/to/cuda/lib/libcud*):
ls -l /usr/local/cuda/lib64/libcud*
-rw-r--r-- 1 root root 560184 Oct 27 16:26 /usr/local/cuda/lib64/libcudadevrt.a
lrwxrwxrwx 1 root root 16 Oct 27 16:26 /usr/local/cuda/lib64/libcudart.so -> libcudart.so.8.0
lrwxrwxrwx 1 root root 19 Oct 27 16:26 /usr/local/cuda/lib64/libcudart.so.8.0 -> libcudart.so.8.0.27
-rwxr-xr-x 1 root root 394472 Oct 27 16:26 /usr/local/cuda/lib64/libcudart.so.8.0.27
-rw-r--r-- 1 root root 737516 Oct 27 16:26 /usr/local/cuda/lib64/libcudart_static.a
-rwxr-xr-x 1 root root 60696704 Oct 27 16:31 /usr/local/cuda/lib64/libcudnn.so
-rwxr-xr-x 1 root root 60696704 Oct 27 16:31 /usr/local/cuda/lib64/libcudnn.so.5
-rwxr-xr-x 1 root root 60696704 Oct 27 16:31 /usr/local/cuda/lib64/libcudnn.so.5.1.3
-rw-r--r-- 1 root root 59715990 Oct 27 16:31 /usr/local/cuda/lib64/libcudnn_static.a
If installed from binary pip package, provide:
- A link to the pip package you installed: sudo pip install tensorflow-gpu on 2/21/17
- The output from python -c "import tensorflow; print(tensorflow.__version__)": 1.0.0
If possible, provide a minimal reproducible example (We usually don’t have time to read hundreds of lines of your code)
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import time
from collections import deque
import numpy as np
print('Importing tensorflow')
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
xavier = tf.contrib.layers.xavier_initializer
batch_norm = tf.contrib.layers.batch_norm
tf.set_random_seed(1)
flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_string('data_dir', 'data/', 'Directory for storing data')
print('Reading data.')
mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)
initialization_variance = 0.1
print('Creating a session.')
sess = tf.Session()
def fully_connected_layer(inputs, output_size, scope, nonlinearity=True):
    with tf.variable_scope(scope):
        input_size = inputs.get_shape()[1].value
        weights = tf.get_variable('w',
                                  [input_size, output_size],
                                  initializer=tf.constant_initializer(0.0))
        bias = tf.get_variable('b',
                               [output_size],
                               initializer=tf.constant_initializer(0.0))
        multiplied = tf.matmul(inputs, weights)
        return tf.nn.elu(multiplied + bias) if nonlinearity else multiplied + bias
print('Setting up the model.')
input_images = tf.placeholder(tf.float32, [None, 784])
labels = tf.placeholder(tf.float32, [None, 10]) # labels
learning_rate_placeholder = tf.placeholder(tf.float32, name='learning-rate')
layer_a = fully_connected_layer((input_images), 50, 'a')
for i in range(5):
    layer_a = batch_norm(fully_connected_layer(layer_a, 50, str(i)))
layer_b = fully_connected_layer(layer_a, 10, 'b', nonlinearity=False)
predictions = tf.nn.softmax(layer_b)
cross_entropy = tf.reduce_mean(-tf.reduce_sum(labels * tf.log(predictions), reduction_indices=[1]))
train_step = tf.train.AdamOptimizer(learning_rate_placeholder).minimize(cross_entropy)
print('Initializing variables.')
sess.run(tf.global_variables_initializer())
What other attempted solutions have you tried?
None.
Logs or other output that would be helpful
Some output from python2 -m cProfile -o out.profile 1-mnist-simple.py. This is a call to sess.run(tf.global_variables_initializer()).
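For anyone trying to reproduce this, profiling just the one slow call can be easier to read than profiling the whole script. Below is a minimal sketch using only the standard library; profile_call is a hypothetical helper name (not part of TensorFlow), and it assumes Python 3 because of io.StringIO.

```python
import cProfile
import io
import pstats
import time


def profile_call(fn, top=10):
    """Run fn() under cProfile and return (elapsed_seconds, report_text).

    `fn` is any zero-argument callable; `top` limits how many rows of the
    cumulative-time report are kept.
    """
    profiler = cProfile.Profile()
    start = time.time()
    profiler.enable()
    try:
        fn()
    finally:
        profiler.disable()
    elapsed = time.time() - start

    # Render the stats into a string instead of printing to stdout.
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats('cumulative').print_stats(top)
    return elapsed, buf.getvalue()


if __name__ == '__main__':
    # Stand-in workload; in the script above this would be something like
    # lambda: sess.run(tf.global_variables_initializer())
    seconds, report = profile_call(lambda: sum(range(100000)))
    print('took %.3f s' % seconds)
    print(report)
```

Running it a few times in a row would also show whether the call really gets faster on repeated runs, as described above.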

About this issue
- State: closed
- Created 7 years ago
- Reactions: 1
- Comments: 19 (8 by maintainers)
Alas! I’m going to close, but with sadness. Let us know if a clearer picture ever emerges.
I had this same issue with my own docker container, even while using the ‘latest-devel-gpu’ container as @afri mentioned.
However, @chrisranderson, your comment about opencv made me realize what the problem was. You’re right, compiling opencv with cuda support caused the slow startup for me.
By compiling opencv with -D WITH_CUDA=OFF, the problem is fixed, and my models run almost instantly.
EDIT: Also, if you want a quick fix, you can just try importing cv2 before importing tensorflow. That worked for me too.
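The import-order workaround can be written down as a small sketch. import_in_order is a hypothetical helper, not something from opencv or tensorflow; it only fixes the order in which modules are imported and skips any that are not installed. Whether this helps on a given machine is an assumption based on the report above.

```python
import importlib


def import_in_order(module_names):
    """Import modules in the listed order.

    Returns a dict mapping each successfully imported name to its module;
    names that are not installed are skipped.
    """
    loaded = {}
    for name in module_names:
        try:
            loaded[name] = importlib.import_module(name)
        except ImportError:
            # Module not available in this environment; skip it.
            continue
    return loaded


if __name__ == '__main__':
    # cv2 first, then tensorflow, as suggested in the comment above.
    modules = import_in_order(['cv2', 'tensorflow'])
    print('imported:', list(modules))
```

Timing the initializer with and without cv2 imported first would be one way to check whether this matters for a particular install.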