tensorflow: tfcompile with tf.nn.dynamic_rnn crashes

Trying to build a C++ binary with tfcompile crashes with INVALID ARGUMENTS: Missing Exit successor to rnn/while/Switch when the graph contains tf.nn.dynamic_rnn, but it works with tf.nn.static_rnn. Why is this?

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 35 (31 by maintainers)

Most upvoted comments

It is a goal, but because XLA JIT-compiles a graph and performs memory layout before any execution, you will have to provide a "maximum max_time" for the memory layout (even if the loop will not compute up to that maximum size). For dynamic_rnn, this means you will have to pad your batched sequence inputs to a small set of fixed max_time sizes, such as 10, 50, 100, …, to ensure XLA doesn't try to JIT-compile for every possible max(sequence_length) across your minibatches. The TensorFlow NMT tutorial shows how to bucket batches by sequence length, but it does not currently pad the input max_times to the bucket boundaries.
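To illustrate the padding idea above, here is a minimal sketch in plain NumPy. The bucket boundaries and the pad_to_bucket helper are made up for illustration; the point is that each minibatch gets padded up to the nearest bucket boundary, so XLA only ever sees a small, fixed set of max_time shapes instead of one shape per minibatch.

```python
import numpy as np

# Example bucket boundaries (assumed values, matching the 10/50/100 sizes
# mentioned above); real boundaries would come from your data distribution.
BUCKETS = [10, 50, 100]

def pad_to_bucket(batch, buckets=BUCKETS):
    """batch: list of [time, features] arrays of varying lengths.

    Returns a [batch, bucket, features] array, zero-padded in time up to the
    smallest bucket that fits the longest sequence in this minibatch.
    """
    max_len = max(seq.shape[0] for seq in batch)
    bucket = next(b for b in buckets if b >= max_len)
    feat = batch[0].shape[1]
    out = np.zeros((len(batch), bucket, feat), dtype=batch[0].dtype)
    for i, seq in enumerate(batch):
        out[i, :seq.shape[0], :] = seq
    return out
```

With this, a minibatch whose longest sequence has 42 steps is padded to 50, not 42, so XLA compiles for at most len(BUCKETS) distinct shapes.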

On Wed, Dec 20, 2017 at 8:49 AM, leod notifications@github.com wrote:

As a follow-up question, is it within XLA’s goals to work with seq2seq models at some point? Will it be possible to use XLA on graphs that have variable length inputs and outputs? Right now, it seems that tfcompile does not like fetches with dimension -1.

— Reply to this email directly or view it on GitHub: https://github.com/tensorflow/tensorflow/issues/11275#issuecomment-353117651

I tried again just now with master and it still fails with INVALID ARGUMENTS: Input 0 of node encoder/rnn/while/add/y was passed int32 from encoder/rnn/while/Switch:1 incompatible with expected INVALID.

@carlthome Yeah, one way or another we’ll have support for loops in XLA AOT; we just need to work out the technical details.

We don’t have documentation about what TensorFlow ops we support via XLA AOT or JIT at the moment, but the implementation of all of the TF->XLA translations is here: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/compiler/tf2xla/kernels

And yes, thanks for mentioning unrolling. Indeed for some models, a workaround is to statically unroll your loops before feeding into XLA. An example for LSTMs is here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/tests/lstm.py https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/tests/lstm_test.py
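The difference between the two approaches can be sketched in plain Python (a toy tanh RNN with assumed shapes, not TensorFlow's actual LSTM implementation): dynamic_rnn builds a while-loop whose trip count depends on runtime data, whereas static unrolling emits a fixed number of cell applications, so the whole computation has a fixed shape that XLA AOT can lay out.

```python
import numpy as np

# Toy illustration of static unrolling: the time loop has a
# compile-time-constant trip count, so every intermediate shape is fixed.
MAX_TIME = 10  # known at graph-construction time, like static_rnn's unrolling

def unrolled_rnn(x, w_h, w_x):
    """x: [MAX_TIME, features] input; returns the final hidden state."""
    h = np.zeros(w_h.shape[0])
    for t in range(MAX_TIME):  # Python loop -> MAX_TIME explicit cell steps
        h = np.tanh(w_h @ h + w_x @ x[t])
    return h
```

The trade-off is that shorter inputs must be padded up to MAX_TIME and the generated code grows with the unroll length, which is exactly why the padding/bucketing advice above matters.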

I’m having the same issue with tfcompile and tf.nn.dynamic_rnn.

OS: Ubuntu 16.04 TF version: 1.2.1 built from source without GPU support and no XLA JIT.

Test case:

import tensorflow as tf

xin = tf.placeholder(tf.float32, [1, 10, 10], name='input')
cell = tf.contrib.rnn.LSTMCell(512)
out, _ = tf.nn.dynamic_rnn(cell, xin, dtype=tf.float32)
out = tf.identity(out, name='output')

sess = tf.Session()
sess.run(tf.global_variables_initializer())

# Write out the GraphDef and checkpoint that the BUILD file below refers to.
tf.train.write_graph(sess.graph_def, '.', 'outgraph.pb', as_text=False)
saver = tf.train.Saver()
saver.save(sess, 'outgraph.ckpt')

outgraph.config.pbtxt:

feed {
  id {
    node_name: "input"
  }
  shape {
    dim { size: 1 }
    dim { size: 10 }
    dim { size: 10 }
  }
}

fetch {
  id {
    node_name: "output"
  }
}

BUILD file:

load("//tensorflow/compiler/aot:tfcompile.bzl", "tf_library")

tf_library(
  name = "test_dyn_rnn",
  cpp_class = "TestDynRNN",
  graph = "outgraph.pb",
  config = "outgraph.config.pbtxt",
  freeze_checkpoint = "outgraph.ckpt"
)

Command: bazel build //tensorflow/compiler/aot/tests/test_dyn_rnn:test_dyn_rnn

The build command fails with INVALID ARGUMENTS: Missing Exit successor to encoder/rnn/while/Switch. Swapping to tf.nn.static_rnn works, however.

Here’s a complete example.