tensorflow: "Invalid loop structure. Mismatched parent frames"

Computing the gradient through a Keras BatchNormalization layer inside a tf.while_loop, when the input placeholder has a batch dimension of None, can cause the following error:

“tensorflow.python.framework.errors_impl.InternalError: Invalid loop structure: Mismatched parent frames for "while/while_context": "while/while_context" vs "". This is an internal bug, please file a bug report with instructions on how to reproduce the error.”

System information

Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
OS Platform and Distribution: Ubuntu 16.04
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): Reproducible using the latest release [v1.12.0-rc1-0-g7b08198113 1.12.0-rc1] and at least as far back as 1.10.0.
Python version: 3.5.2
CUDA/cuDNN version: Reproducible without CUDA.
GPU model and memory: Reproducible without any GPUs.

Current Behavior: running the script below raises an InternalError ("Invalid loop structure: Mismatched parent frames") whose message describes it as an internal bug.

Expected Behavior: the script still fails, since the code is not valid, but with a meaningful error rather than this one.

Code to reproduce the issue

This is the minimal example that I can get to cause a crash. It isn’t “correct” code in that it shouldn’t work, but it shouldn’t crash because of mismatched parent frames. Code that I believe is actually correct also causes the crash, but is much longer.

import numpy as np
import tensorflow as tf
### MUST be keras, not from tensorflow import keras
import keras

if __name__ == "__main__":
  sess = keras.backend.get_session()

  ### I can't reproduce this with tf.layers.batch_normalization
  model = keras.models.Sequential([keras.layers.BatchNormalization(input_shape=(1,))])
  ### This next line MUST have the None, otherwise it doesn't crash
  x = tf.placeholder(tf.float32, (None, 1))

  eta = tf.zeros(tf.shape(x))
  def cond(i, _):
    return tf.less(i, 10)
  def body(i, e):
    preds = model(x+e)
    return i + 1, tf.gradients(preds, x)[0]

  _, eta = tf.while_loop(cond, body, [tf.zeros([]), eta])

  sess.run(eta, {x: np.zeros((128,1))})

Other info / logs

2018-10-22 17:21:30.132455: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Traceback (most recent call last):
  File "/home/ncarlini/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/home/ncarlini/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/ncarlini/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InternalError: Invalid loop structure: Mismatched parent frames for "while/while_context": "while/while_context" vs "". The node giving this error: {{node while/gradients/while/sequential_1/batch_normalization_1/cond/batchnorm/mul_1_grad/Shape/Enter}}This is an internal bug, please file a bug report with instructions on how to reproduce the error.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "bug.py", line 26, in <module>
    sess.run(eta, {x: np.zeros((128,1))})
  File "/home/ncarlini/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/home/ncarlini/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/ncarlini/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/home/ncarlini/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Invalid loop structure: Mismatched parent frames for "while/while_context": "while/while_context" vs "". The node giving this error: node while/gradients/while/sequential_1/batch_normalization_1/cond/batchnorm/mul_1_grad/Shape/Enter (defined at bug.py:22) This is an internal bug, please file a bug report with instructions on how to reproduce the error.
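As a rough aid to reading the error, the node name from the log can be split into its name-scope components. This is only a sketch in plain Python (no TensorFlow needed); the helper name scope_path is hypothetical, and the interpretation in the comments is a reading of the name, not something confirmed by the TensorFlow maintainers.

```python
# Node name copied verbatim from the error message in the log above.
NODE = ("while/gradients/while/sequential_1/batch_normalization_1/"
        "cond/batchnorm/mul_1_grad/Shape/Enter")


def scope_path(node_name):
    """Split a TensorFlow node name into its name-scope components.

    Hypothetical helper for illustration; it only splits on "/".
    """
    return node_name.split("/")


parts = scope_path(NODE)

# Print the components as an indented tree, outermost scope first.
for depth, part in enumerate(parts):
    print("  " * depth + part)

# Two "while" components: the loop scope itself, and the forward-op name
# (which also starts with "while/") mirrored under "gradients/".
num_while_scopes = parts.count("while")

# A "cond" component sits under the batch-normalization layer's scope.
has_cond_scope = "cond" in parts

print("while scopes:", num_while_scopes)
print("cond scope present:", has_cond_scope)
```

Read this way, the failing node appears to be a gradient op (mul_1_grad) created inside the loop for a forward op that sits under a cond scope of the BatchNormalization layer. That is consistent with the crash depending on calling tf.gradients inside the loop body, but it does not by itself explain the mismatched frames.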

About this issue

State: closed
Created 6 years ago
Comments: 21 (2 by maintainers)

Most upvoted comments

I encountered the same problem when using Dropout layers rather than BatchNorm. I was able to solve it by calling K.set_learning_phase(0). However, this doesn’t work with BatchNorm.

Shawn-Shan on Mar 1, 2019

Can we get some attention on this bug? It is consistently a problem for our use case in tensorflow/cleverhans and has been present for a year and a half.

carlini on Mar 24, 2020

Last I checked it’s still an issue with the same crash.

carlini on Jan 31, 2020