tensorflow: Following instructions in batch_normalization docs produces an exception
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 17.10
- TensorFlow installed from (source or binary): Binary (pip install tensorflow)
- TensorFlow version (use command below): v1.4.0-rc1-11-g130a514 1.4.0
- Python version: 3.6.3
- CUDA/cuDNN version: N/A
- GPU model and memory: N/A
- Exact command to reproduce: (see the “Source code” section)
Describe the problem
I was following the instructions for updating `moving_mean` and `moving_variance` using the code snippet provided in the `batch_normalization` documentation, and it resulted in a "tensorflow.python.framework.errors_impl.InvalidArgumentError: Retval[0] does not have value" exception.
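For reference, the documented pattern I was following looks roughly like this (a minimal placeholder model, not my actual code):

```python
import tensorflow as tf

# Minimal sketch of the documented pattern: batch_normalization registers its
# moving_mean/moving_variance updates in the UPDATE_OPS collection, and the
# docs say to make the train op depend on them.
x = tf.placeholder(tf.float32, [None, 10])
y = tf.placeholder(tf.float32, [None, 1])
is_training = tf.placeholder(tf.bool)

h = tf.layers.batch_normalization(x, training=is_training)
logits = tf.layers.dense(h, 1)
loss = tf.losses.mean_squared_error(y, logits)

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
```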
Source code / logs
Full instructions to reproduce:
git clone -b control-dependencies-exc https://github.com/naktinis/language-id.git ctrl-dep-exc
cd ctrl-dep-exc/
python3 -m venv venv
. venv/bin/activate
pip install tensorflow==1.4.0 Pillow
python3 main.py --image-dir test_data/ --label-file test_data/labels.csv --model rnn
The specific code change that was enough to produce the exception (seems to match the snippet in the official documentation): https://github.com/naktinis/language-id/commit/50740f
Full exception:
Traceback (most recent call last):
File "main.py", line 173, in <module>
tf.app.run(main=run_experiment)
File "/home/naktinis/code/ctrl-dep-exc/venv/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "main.py", line 165, in run_experiment
hparams=params,
File "/home/naktinis/code/ctrl-dep-exc/venv/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/learn_runner.py", line 218, in run
return _execute_schedule(experiment, schedule)
File "/home/naktinis/code/ctrl-dep-exc/venv/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/learn_runner.py", line 46, in _execute_schedule
return task()
File "/home/naktinis/code/ctrl-dep-exc/venv/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 625, in train_and_evaluate
self.train(delay_secs=0)
File "/home/naktinis/code/ctrl-dep-exc/venv/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 367, in train
hooks=self._train_monitors + extra_hooks)
File "/home/naktinis/code/ctrl-dep-exc/venv/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 807, in _call_train
hooks=hooks)
File "/home/naktinis/code/ctrl-dep-exc/venv/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 302, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/home/naktinis/code/ctrl-dep-exc/venv/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 783, in _train_model
_, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
File "/home/naktinis/code/ctrl-dep-exc/venv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 521, in run
run_metadata=run_metadata)
File "/home/naktinis/code/ctrl-dep-exc/venv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 892, in run
run_metadata=run_metadata)
File "/home/naktinis/code/ctrl-dep-exc/venv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 967, in run
raise six.reraise(*original_exc_info)
File "/home/naktinis/code/ctrl-dep-exc/venv/lib/python3.6/site-packages/six.py", line 693, in reraise
raise value
File "/home/naktinis/code/ctrl-dep-exc/venv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 952, in run
return self._sess.run(*args, **kwargs)
File "/home/naktinis/code/ctrl-dep-exc/venv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 1024, in run
run_metadata=run_metadata)
File "/home/naktinis/code/ctrl-dep-exc/venv/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py", line 827, in run
return self._sess.run(*args, **kwargs)
File "/home/naktinis/code/ctrl-dep-exc/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 889, in run
run_metadata_ptr)
File "/home/naktinis/code/ctrl-dep-exc/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1120, in _run
feed_dict_tensor, options, run_metadata)
File "/home/naktinis/code/ctrl-dep-exc/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1317, in _do_run
options, run_metadata)
File "/home/naktinis/code/ctrl-dep-exc/venv/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1336, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Retval[0] does not have value
About this issue
- State: closed
- Created 7 years ago
- Comments: 25 (18 by maintainers)
Commits related to this issue
- fixing tensorflow batch norm vs control dependencies issue https://github.com/tensorflow/tensorflow/issues/14357 — committed to ufal/neuralmonkey by jindrahelcl 6 years ago
- Include support for batch norm layer Batch normalization layer should be double checked for correctness. Currently the contrib version is being used and sets the `output_collections` option to `None`... — committed to yeahml/yeahml by JackBurdick 6 years ago
I ran into the same problem.
When removing the RNN ops, or replacing `tf.layers.batch_normalization` with `tf.contrib.layers.batch_norm(inputs=inputs, updates_collections=None)`, everything works. Otherwise, `Retval[0] does not have value` occurs. @ebrevdo
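For clarity, the `tf.contrib.layers.batch_norm` workaround mentioned above looks roughly like this (placeholder input, not the original model); with `updates_collections=None` the moving statistics are updated in place during the forward pass, so no control dependency on `UPDATE_OPS` is needed:

```python
import tensorflow as tf

# Sketch of the contrib workaround: updates_collections=None makes the
# moving-average updates run as part of the layer's forward computation
# instead of being collected into tf.GraphKeys.UPDATE_OPS.
inputs = tf.placeholder(tf.float32, [None, 10])
is_training = tf.placeholder(tf.bool)

normalized = tf.contrib.layers.batch_norm(
    inputs=inputs,
    updates_collections=None,
    is_training=is_training)
```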
Does the `tf.contrib.layers.batch_norm` workaround work if I'm not doing the usual `optimizer.minimize()` training, but instead using `optimizer.compute_gradients()` on 2 small batches, adding up the gradients, and then calling `optimizer.apply_gradients()` on the summed gradients? Will the BN moving mean & variance get updated once per small batch, every other small batch, once per `apply_gradients()` call, or never?

This is important for cases where the batch size is too big for GPU memory and the gradients need to be computed separately for smaller sub-batches.
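To make the question concrete, here is a rough sketch of the accumulation scheme I mean (toy model, hypothetical names; whether and when the moving statistics get updated under it is exactly what I'm asking):

```python
import numpy as np
import tensorflow as tf

# Toy sketch of gradient accumulation over two small sub-batches.
x = tf.placeholder(tf.float32, [None, 4])
y = tf.placeholder(tf.float32, [None, 1])

h = tf.contrib.layers.batch_norm(x, updates_collections=None, is_training=True)
pred = tf.layers.dense(h, 1)
loss = tf.losses.mean_squared_error(y, pred)

optimizer = tf.train.GradientDescentOptimizer(0.1)
grads_and_vars = optimizer.compute_gradients(loss)
grads = [g for g, _ in grads_and_vars]
tvars = [v for _, v in grads_and_vars]

# Placeholders so the numerically summed gradients can be fed back in.
grad_phs = [tf.placeholder(g.dtype, v.get_shape()) for g, v in grads_and_vars]
apply_op = optimizer.apply_gradients(list(zip(grad_phs, tvars)))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sub_batches = [(np.random.rand(8, 4), np.random.rand(8, 1)) for _ in range(2)]

    # Compute gradients on each small sub-batch separately...
    per_batch = [sess.run(grads, {x: bx, y: by}) for bx, by in sub_batches]

    # ...sum them, and apply the summed gradients in a single step.
    summed = [np.add(g1, g2) for g1, g2 in zip(*per_batch)]
    sess.run(apply_op, feed_dict=dict(zip(grad_phs, summed)))
```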
I don’t think I have authorization to re-open this issue.
I ran into the same problem using:
I’ve also managed to find a workaround. Saving the `update_ops` operation and forcing its evaluation explicitly through a `session.run()` call, instead of using `with tf.control_dependencies(update_ops)`, seems to mitigate the issue.

@jpienaar had some plans to raise a better error when an op inside a while loop is used as a control input to an op outside the loop. I’m not sure what the status or timeline of that is at the moment.
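A rough sketch of the workaround described above (placeholder model and feed values, not the reporter's actual code): fetch the update ops explicitly in `session.run()` rather than wrapping the train op in `tf.control_dependencies`:

```python
import numpy as np
import tensorflow as tf

# Sketch of the workaround: no control dependency on UPDATE_OPS; instead the
# update ops are fetched explicitly in the same session.run() call as train_op.
x = tf.placeholder(tf.float32, [None, 10])
y = tf.placeholder(tf.float32, [None, 1])
is_training = tf.placeholder(tf.bool)

h = tf.layers.batch_normalization(x, training=is_training)
loss = tf.losses.mean_squared_error(y, tf.layers.dense(h, 1))

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)  # no control_dependencies

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    batch_x, batch_y = np.random.rand(8, 10), np.random.rand(8, 1)
    # Run the moving-average updates alongside the train op each step.
    sess.run([train_op, update_ops],
             feed_dict={x: batch_x, y: batch_y, is_training: True})
```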
@fchollet this seems to be enough to reproduce it https://gist.github.com/naktinis/4200f955087bb07223f1fc3bb6255528