tensorflow: Weird bug in tf.keras.Model.predict(x=tf.data.Dataset iterator)

Please make sure that this is a bug. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub.

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): 16.04
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
  • TensorFlow installed from (source or binary): pip
  • TensorFlow version (use command below): 1.9.0 and 1.12.0 (I am using 1.9.0 but the bug is present in 1.12.0 also)
  • Python version: 3.6
  • Bazel version (if compiling from source):
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version:
  • GPU model and memory: M60 16 GB (two 8 GB GPUs)

Describe the current behavior

When calling tf.keras.Model.predict(x=<tf.data.Dataset iterator>, steps=...) with a tf.data.Dataset iterator, I get a confusing value error:

Please provide data as a list or tuple of 2 elements - input and target pair. Received Tensor("IteratorGetNext:0", dtype=int64)

The above error is misleading: why does it need (X, Y) for prediction?

My test tf.data.Dataset iterator obviously does not yield an (X, Y) tuple; it yields only X in batches. When I pass a NumPy array of X as input, predict works as intended. If I use the dataset iterator with eager execution enabled, I get this error (my batch size is 2):

Please provide data as a list or tuple of 2 elements - input and target pair. Received tf.Tensor( [[ 68 5 521 … 0 0 0] [ 6705 1235757 2411 … 2804 147 13]], shape=(2, 5000), dtype=int64). We do not use the target value here.

This makes it clear that even when eager execution is enabled, Y is not used.
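To make the contrast concrete, here is a toy sketch of the two calls on 1.9/1.12. The data and model are placeholders, not my real pipeline; only the shape of the calls matters:

import numpy as np
import tensorflow as tf

# Placeholder features and a tiny single-input model, standing in for my real setup.
x = np.random.rand(20, 5000).astype('float32')

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(4, activation='relu', input_shape=(5000,)),
    tf.keras.layers.Dense(2, activation='softmax'),
])

model.predict(x, batch_size=2)           # NumPy input: works as intended, X only

dataset = tf.data.Dataset.from_tensor_slices(x).batch(2)   # X only, no targets
iterator = dataset.make_one_shot_iterator()
model.predict(x=iterator, steps=10)      # iterator input: raises the ValueError quoted above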

Moreover, why does the tf.data.Dataset iterator need to output an (X, Y) tuple when used with tf.keras.Model.predict()? Is this the expected behaviour?

NOTE: My model is a single-input model, not a multi-input model.

EDIT: I worked around the error by providing (X, Y), but the Keras progress bar doesn't seem to work with it.
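For reference, a minimal sketch of that workaround, using MNIST as stand-in data rather than my real pipeline: the only change from the failing X-only case is pairing the inputs with throwaway targets so the (X, Y) check passes.

import tensorflow as tf

# Sketch of the workaround: zip never-used targets into the dataset.
(_, _), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_test = x_test.reshape((10000, 28, 28, 1)).astype('float32')
y_test = tf.keras.utils.to_categorical(y_test)

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(8, kernel_size=3, activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation='softmax'),
])

dataset = tf.data.Dataset.from_tensor_slices((x_test, y_test))  # (X, Y) instead of X only
dataset = dataset.batch(10)
iterator = dataset.make_one_shot_iterator()

output = model.predict(x=iterator, steps=1000, verbose=1)  # runs, but the progress bar misbehaves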

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Comments: 17 (5 by maintainers)

Most upvoted comments

@wt-huang I am able to use model.fit(); there isn't any issue there. The problem is that model.predict() requires (X, Y) when I pass a tf.data.Dataset iterator, but if I pass a NumPy array it takes only X and works as intended.

I think I am encountering this error in tensorflow==1.14.0 as well: model.fit() finishes successfully with tf.data.Dataset input, but model.predict() keeps giving this error:

ValueError: Please provide model inputs as a list or tuple of 2 or 3 elements: (input, target) or (input, target, sample_weights) Received (<tf.Tensor 'IteratorGetNext_1:0' shape=(224, 224, 3) dtype=float32>, <tf.Tensor 'IteratorGetNext_1:1' shape=(224, 224, 3) dtype=float32>)

When I construct the dataset with X only, it looks like predict() is expecting Y as well?
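To be concrete, this is roughly what I mean by "X only" (random placeholder data instead of my real image pipeline):

import numpy as np
import tensorflow as tf  # 1.14.0

# Placeholder stand-in for my real image inputs.
images = np.random.rand(8, 224, 224, 3).astype('float32')

dataset = tf.data.Dataset.from_tensor_slices(images).batch(4)  # X only, no targets

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(4, kernel_size=3, activation='relu', input_shape=(224, 224, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation='softmax'),
])

predictions = model.predict(dataset, steps=2)  # this is the call that fails for me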

@mrry and @jvishnuvardhan, thanks! This appears to be fixed in

tf-nightly-1.13.0-dev20190213

The following code snippet threw the error I mentioned when running with <=1.12.0 (stable version):

import tensorflow as tf

# Load MNIST test data and build a small CNN classifier.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_test = x_test.reshape((10000, 28, 28, 1))
y_test = tf.keras.utils.to_categorical(y=y_test)

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(8, kernel_size=3, activation='relu', input_shape=(28, 28, 1)))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(10, activation='softmax'))

# X-only dataset iterator; until version 1.12.0 I needed to provide y_test here as well.
dataset = tf.data.Dataset.from_tensor_slices(x_test)
dataset = dataset.batch(batch_size=10)
data = dataset.make_one_shot_iterator()

output = model.predict(x=data, steps=1000, verbose=1)

This runs as intended in the nightly version in both graph and eager mode.

One more thing I really liked: it doesn't look like I need to create a one_shot_iterator or initializable_iterator any more. I can pass the dataset object directly to model.predict, which feels more natural 👍 (see the short sketch at the end of this comment). Previously, when using tf.data with Embedding layers, I had to write some boilerplate code to initialize lookup tables:

train_iterator = train_data.make_initializable_iterator()
valid_iterator = valid_data.make_initializable_iterator()

# initialize tables and iterators using sess from tf.keras
init_sess = tf.keras.backend.get_session()
init_sess.run(train_iterator.initializer)
init_sess.run(valid_iterator.initializer)
init_sess.run(tf.tables_initializer())

My question is: do I still need to run this boilerplate code in upcoming versions of TensorFlow? It would be really great if it were handled internally.
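For reference, the direct-Dataset call that works for me on the nightly, reusing the model and x_test from the snippet above:

# tf-nightly: the Dataset itself can be passed to predict, no iterator needed.
dataset = tf.data.Dataset.from_tensor_slices(x_test).batch(10)
output = model.predict(x=dataset, steps=1000, verbose=1)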