tensorflow: Inconsistent behavior of tf.function when using autograph

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Mac OS X 10.13.6
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): VERSION="1.13.0-dev20181226" (this is the TF 2.0-preview) GIT_VERSION="b'v1.12.0-5133-gc343196842'"
  • Python version: 3.6.6
  • Bazel version (if compiling from source): N/A
  • GCC/Compiler version (if compiling from source): N/A
  • CUDA/cuDNN version: N/A
  • GPU model and memory: N/A

Describe the current behavior

tf.function converts almost identical functions into completely different graphs. It seems like a bug, but perhaps it’s just a bit too complicated for me. If it’s working as expected, then I think the documentation really needs to be expanded, with detailed examples and clear guidelines.

Describe the expected behavior

All of the following functions should be converted into nearly identical graphs.

Code to reproduce the issue

import tensorflow as tf

# Tensor argument, Python loop bound, Python increment
@tf.function
def foo1(x):
    for i in range(10):
        x = x + 1
    return x

# Tensor argument, Tensor loop bound, Tensor increment
@tf.function
def foo2(x):
    for i in range(tf.constant(10)):
        x = x + tf.constant(1)
    return x

# No argument, everything is a Python int
@tf.function
def foo3():
    x = 0
    for i in range(10):
        x = x + 1
    return x

# No argument, everything is a Tensor
@tf.function
def foo4():
    x = tf.constant(0)
    for i in range(tf.constant(10)):
        x = x + tf.constant(1)
    return x

def _print_sub_ops(op, indent=0):
    """Recursively print an op's inputs"""
    print("  "*indent, op.name)
    for ts in op.inputs:
        _print_sub_ops(ts.op, indent + 1)

def print_graph(func, *args):
    print(func.__name__)
    ops = func.get_concrete_function(*args).graph.get_operations()
    _print_sub_ops(ops[-1])   # or just print(ops) if you prefer
    print()

print_graph(foo1, tf.constant(0))
print_graph(foo2, tf.constant(0))
print_graph(foo3)
print_graph(foo4)

Other info / logs

Below is the output of this program. Notice that:

  • foo1 is horrible: autograph did not generate a tf.while_loop. Imagine a loop with 10,000 iterations; the graph would just blow up. (A sketch of a tf.range-based rewrite follows the logs below.)
  • foo2 is pretty good, but it’s odd that I have to wrap every integer into a tf.constant.
  • foo3 is perfect, it even reduced the whole graph to a single constant, congrats.
  • foo4 is virtually identical to foo2, which is pretty good, but why didn’t it get the same magic as foo3?
foo1
 Identity
   add_9
     add_8
       add_7
         add_6
           add_5
             add_4
               add_3
                 add_2
                   add_1
                     add
                       x
                       add/y
                     add_1/y
                   add_2/y
                 add_3/y
               add_4/y
             add_5/y
           add_6/y
         add_7/y
       add_8/y
     add_9/y

foo2
 Identity
   while/Identity_2
     while
       while/loop_counter
       Const_1
       x
       maximum_iterations
       range
         range/start
         Maximum
           Const
           Maximum/y
         range/delta

foo3
 Identity
   Const

foo4
 Identity
   while/Identity_2
     while
       while/loop_counter
       Const_2
       Const
       maximum_iterations
       range
         range/start
         Maximum
           Const_1
           Maximum/y
         range/delta
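
For comparison, a minimal sketch of how foo1 can presumably be rewritten so that autograph emits a tf.while_loop: loop over a Tensor produced by tf.range instead of a Python range (the resulting graph should then look like foo2’s).

import tensorflow as tf

@tf.function
def foo1_while(x):
    # tf.range yields a Tensor, so autograph should build a single tf.while_loop
    # instead of unrolling ten add ops at trace time.
    for i in tf.range(10):
        x = x + 1
    return x

print_graph(foo1_while, tf.constant(0))  # reuses the print_graph helper defined above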

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 16 (9 by maintainers)

Most upvoted comments

Thank you, Aurélien, for your input. The insights you gathered are most valuable. Please continue to send us any new data points you might gather.

It would seem that new users tend to have different expectations compared to “veteran” developers who wrote lots of TF graph code.

One thing that becomes obvious is that mechanisms that reliably avoid ambiguity would be useful regardless of what the defaults are. An extreme version of such a mechanism would be an “all-graph” mode, along the lines of what you suggested: @tf.function(autograph=STRICT), where everything would run in-graph and doing anything outside the graph would require special overrides.

As a side note, we recently pushed a change that makes a construct like range(tf.constant(n)) unsupported; it now raises an error. However, that still leaves unexpected behavior when the argument to the function is a Python value, or when users write range(n) out of sheer convenience while fully expecting the loop to stay in-graph. The hope is that this drives the habit of using range to always statically unroll the loop, and tf.range to always run it in-graph (see the sketch below).
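
Concretely, the intended habit would look roughly like this (a minimal sketch assuming the current conversion rules; the function names are just illustrative):

import tensorflow as tf

@tf.function
def unrolled(x):
    # Python range: the loop is statically unrolled into three add ops at trace time.
    for i in range(3):
        x = x + 1
    return x

@tf.function
def in_graph(x):
    # tf.range yields a Tensor, so autograph keeps the loop in the graph as a tf.while_loop.
    for i in tf.range(3):
        x = x + 1
    return x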

I’m looking at https://github.com/tensorflow/docs/blob/master/site/en/r2/guide/autograph.ipynb and realizing that the latest version doesn’t mention this range/tf.range thing. I’ll add a note to that colab.