tensorflow: A bug with slim.create_global_step
When I use `slim.create_global_step` in distributed training, it can make the workers freeze right after the session has been created.
This can be reproduced by simply replacing `global_step = tf.Variable(0, name="global_step", trainable=False)` with `global_step = slim.create_global_step()` in `mnist_replica.py` in dist_test.
About this issue
- State: closed
- Created 7 years ago
- Comments: 21 (10 by maintainers)
@wangyum According to your log,

```
Invalid size in bundle entry: key global_step; stored size 8; expected size 4
```

I think there is a dtype mismatch for the `global_step` variable.
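The byte sizes in that error are consistent with a dtype mismatch: `tf.Variable(0)` defaults to `int32` (4 bytes), whereas the global step created by `slim.create_global_step` is an `int64` variable (8 bytes), so a checkpoint written with one definition cannot be restored into the other. A minimal sketch of the size difference, using NumPy scalars as stand-ins for the two variable dtypes (an illustration only, not the TensorFlow restore path):

```python
import numpy as np

# Stand-ins for the two global_step definitions (assumed dtypes):
# tf.Variable(0) defaults to int32, slim.create_global_step() uses int64.
int32_step = np.int32(0)
int64_step = np.int64(0)

# 8 bytes stored in the checkpoint vs. 4 bytes expected by the variable
# matches "stored size 8; expected size 4" in the error above.
print(int64_step.itemsize)  # 8
print(int32_step.itemsize)  # 4
```

If this is the cause, defining the variable as `tf.Variable(0, dtype=tf.int64, name="global_step", trainable=False)`, or using the slim helper everywhere consistently, should avoid the mismatch.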