tensorflow: sess.run([train_step]) freezes when using batch_normalization and Collection Update
Hi,
I have run into a strange situation with TensorFlow. I searched extensively for this problem but found only one other (unsolved) thread on Stack Overflow (here: https://stackoverflow.com/questions/47047124/tf-layers-batch-normalization-freezes-during-sess-run-1-5-0-dev20171031), so I decided to ask here.
Basically, when I call sess.run(), it freezes. By freeze I mean that GPU utilization drops to zero, no errors are raised, and the process is still on the GPU (GPU memory remains allocated). I have the following code segment:
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_step1 = optimizer1.minimize(loss=loss_fill + lossL2 + loss_detection, var_list=vars)
When I change this part to:
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    pass
train_step1 = optimizer1.minimize(loss=loss_fill + lossL2 + loss_detection, var_list=vars)
which effectively skips the necessary moving-average updates, the training no longer freezes. I have many tf.layers.batch_normalization() instances in my code, and this is the first time I am facing this issue.
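For reference, here is a minimal, self-contained sketch of the kind of graph-mode setup I mean (TF 1.x API). The placeholders x, y, is_training, the layer sizes, and the single cross-entropy loss are only illustrative stand-ins, not my actual graph or loss terms:

import tensorflow as tf

# Illustrative inputs; my real model uses different shapes and losses.
x = tf.placeholder(tf.float32, [None, 64])
y = tf.placeholder(tf.float32, [None, 10])
is_training = tf.placeholder(tf.bool, [])

h = tf.layers.dense(x, 128)
h = tf.layers.batch_normalization(h, training=is_training)  # registers moving-average updates in UPDATE_OPS
h = tf.nn.relu(h)
logits = tf.layers.dense(h, 10)
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=logits))

optimizer1 = tf.train.AdamOptimizer(1e-3)
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    # The moving-mean/variance updates only run if they are made a
    # dependency of the train op (or executed explicitly via sess.run).
    train_step1 = optimizer1.minimize(loss)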
Thanks
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Comments: 21 (6 by maintainers)
@ppwwyyxx you’re right, that issue looks suspiciously similar and probably has the same root cause. On another note, I’ve come up with a workaround for the sample code I posted above.
The following code:
can be transformed into:
which forces the batch norm ops to run before the gradients on res are computed, as desired.
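The two code blocks above did not survive in this copy of the thread, so here is a hedged sketch of the general shape of that rewrite; build_network, compute_loss-style code, res, and the optimizer below are hypothetical names, not the exact code from the comment:

import tensorflow as tf

def build_network(x, is_training):
    # Hypothetical network containing tf.layers.batch_normalization layers.
    h = tf.layers.dense(x, 128)
    h = tf.layers.batch_normalization(h, training=is_training)
    return tf.nn.relu(h)

x = tf.placeholder(tf.float32, [None, 64])
y = tf.placeholder(tf.float32, [None, 128])
is_training = tf.placeholder(tf.bool, [])

res = build_network(x, is_training)

# Instead of wrapping optimizer.minimize() in the control dependency,
# attach the UPDATE_OPS dependency to the network output itself, so the
# batch norm update ops must run before anything that consumes res.
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    res = tf.identity(res)

loss = tf.reduce_mean(tf.squared_difference(res, y))
train_step = tf.train.AdamOptimizer(1e-3).minimize(loss)

With this ordering, computing gradients of the loss already implies running the update ops, which is what the transformation is meant to guarantee.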