tensorflow: Keras doesn't allow tf.data validation without validation_steps
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Mac 10.14.4, Ubuntu 18.01
- TensorFlow installed from (source or binary): binary
- TensorFlow version (use command below): 1.14.1.dev20190524 | 1.14.0rc0 | 1.14.0
- Python version: 3.6.8
- CUDA/cuDNN version: -
- GPU model and memory: -
Describe the current behavior
Using tf.data as validation_data without defining validation_steps fails with TypeError: 'DatasetV1Adapter' object does not support indexing. Using tf.data without steps_per_epoch works as expected when using it as training data instead.
Describe the expected behavior
I think the behaviour of training data and validation data in Keras model.fit should be consistent. This would make Keras a lot easier to use together with tf.data because it gets rid of the need for defining an exact number of steps.
Code to reproduce the issue
import tensorflow as tf
import tensorflow_datasets as tfds

train, test = tfds.load(name="mnist", split=[tfds.Split.TRAIN, tfds.Split.TEST], as_supervised=True)


def scale(image, label):
    return tf.cast(image, tf.float32) / 255, label


model = tf.keras.Sequential(
    [
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ]
)
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.fit(
    train.batch(256),
    validation_data=test.batch(256),
)
Other info / logs
  File "test.py", line 24, in <module>
    epochs=10,
  File "/Users/lukasgeiger/miniconda3/envs/tf-nightly/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 644, in fit
    use_multiprocessing=use_multiprocessing)
  File "/Users/lukasgeiger/miniconda3/envs/tf-nightly/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_arrays.py", line 615, in fit
    steps_name='steps_per_epoch')
  File "/Users/lukasgeiger/miniconda3/envs/tf-nightly/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_arrays.py", line 145, in model_iteration
    _print_train_info(inputs, val_inputs, steps_per_epoch, verbose)
  File "/Users/lukasgeiger/miniconda3/envs/tf-nightly/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_arrays.py", line 450, in _print_train_info
    hasattr(inputs[0], 'shape') and hasattr(val_inputs[0], 'shape')):
TypeError: 'DatasetV1Adapter' object does not support indexing
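The last frame of the traceback indexes val_inputs directly, which only works for list-like inputs. Below is a minimal sketch of that failure mode using a stand-in class rather than TensorFlow itself. FakeDataset, index_error_type and looks_like_array_inputs are hypothetical names for illustration, not TensorFlow code, and the guarded check is only one possible way to avoid the error.

```python
class FakeDataset:
    """Stand-in for a tf.data dataset: iterable, but without __getitem__."""

    def __iter__(self):
        return iter([([0.0], 0), ([1.0], 1)])


def index_error_type(inputs):
    """Index the inputs without any guard and report the exception type, if any."""
    try:
        inputs[0]
    except TypeError as err:
        return type(err).__name__
    return None


def looks_like_array_inputs(inputs):
    """Return True only for non-empty list/tuple inputs whose first element has a shape.

    Checking the container type first avoids indexing into objects that
    do not support it.
    """
    return (
        isinstance(inputs, (list, tuple))
        and len(inputs) > 0
        and hasattr(inputs[0], "shape")
    )


print(index_error_type(FakeDataset()))         # unguarded indexing fails
print(looks_like_array_inputs(FakeDataset()))  # guarded check does not index
```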
About this issue
- State: closed
- Created 5 years ago
- Reactions: 9
- Comments: 29 (20 by maintainers)
Sorry for opening this after almost a year, but from the docstring of model.fit():
steps_per_epoch: Integer or None. Total number of steps (batches of samples) before declaring one epoch finished and starting the next epoch. When training with input tensors such as TensorFlow data tensors, the default None is equal to the number of samples in your dataset divided by the batch size, or 1 if that cannot be determined. If x is a tf.data dataset, and ‘steps_per_epoch’ is None, the epoch will run until the input dataset is exhausted. This argument is not supported with array inputs.
Why not do the same (If x is a tf.data dataset, and ‘steps_per_epoch’ is None, the epoch will run until the input dataset is exhausted) for validation data?
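The "run until the dataset is exhausted" behaviour quoted above can be illustrated without TensorFlow. This is only a sketch of the semantics being asked for, not how Keras implements it. batches_for_one_pass and mean_loss are hypothetical helpers, and plain Python lists stand in for dataset batches.

```python
from itertools import islice


def batches_for_one_pass(batches, steps=None):
    """Yield batches for one epoch or one validation pass.

    With steps=None the pass ends when the iterable is exhausted;
    otherwise it ends after at most `steps` batches.
    """
    iterator = iter(batches)
    if steps is None:
        yield from iterator
    else:
        yield from islice(iterator, steps)


def mean_loss(batches, loss_fn, steps=None):
    """Average a per-batch loss over one pass (not weighted by batch size)."""
    losses = [loss_fn(batch) for batch in batches_for_one_pass(batches, steps)]
    if not losses:
        raise ValueError("no batches were drawn")
    return sum(losses) / len(losses)


# Three toy "batches"; sum() stands in for a real loss function.
validation_batches = [[1.0, 2.0], [3.0, 4.0], [6.0]]
print(mean_loss(validation_batches, loss_fn=sum))           # uses all three batches
print(mean_loss(validation_batches, loss_fn=sum, steps=2))  # uses the first two only
```

With this shape, training and validation can share the same rule: an explicit step count caps the pass, and None means "consume everything once".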
The docstring is a bit confusing.
validation_data: Data on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data. validation_data will override validation_split. validation_data could be:
However,
validation_steps: Only relevant if validation_data is provided and is a tf.data dataset. Total number of steps (batches of samples) to draw before stopping when performing validation at the end of every epoch. If validation_data is a tf.data dataset and ‘validation_steps’ is None, validation will run until the validation_data dataset is exhausted.
For 1.15, during model.fit(), when the validation data is tf.data and validation_steps is None, the following error is raised:
ValueError: When using data tensors as input to a model, you should specify the `steps_per_epoch` argument.
But this is not raised during model.evaluate() with steps=None.
This is fixed with latest tf-nightly version ‘1.15.0-dev20190808’
Yup, can reproduce here as well with 1.14.0.
Suboptimal workarounds are:
- verbose=0
- steps_per_epoch to something that is not None
- validation_data
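As a rough sketch of the steps_per_epoch workaround, the step counts can be computed from the dataset sizes and passed explicitly. steps_for and fit_with_explicit_steps are hypothetical helpers, and the MNIST sizes (60,000 training and 10,000 test examples) are filled in by hand. This has not been checked against every affected version.

```python
import math


def steps_for(num_examples, batch_size):
    """Number of batches needed to cover num_examples once (last batch may be partial)."""
    return math.ceil(num_examples / batch_size)


def fit_with_explicit_steps(model, train, test, num_train, num_test, batch_size=256):
    """Call model.fit with both step counts set, so neither has to be inferred.

    `train` and `test` are assumed to be unbatched tf.data datasets and
    `model` a compiled Keras model.
    """
    return model.fit(
        train.batch(batch_size),
        steps_per_epoch=steps_for(num_train, batch_size),
        validation_data=test.batch(batch_size),
        validation_steps=steps_for(num_test, batch_size),
    )


# Step counts for MNIST with the batch size used in the report.
print(steps_for(60000, 256))  # 235
print(steps_for(10000, 256))  # 40
```

With the objects from the reproduction script this would be called as fit_with_explicit_steps(model, train, test, 60000, 10000). When training for more than one epoch with explicit step counts, the datasets may also need .repeat(), which is part of why this workaround is suboptimal.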
@lgeiger I could reproduce the issue when I select “cpu”. If I select “gpu” as shown in my gist, there is no error. Thanks!
@lgeiger I ran it in 1.14.1.dev20190524 and don’t see any error. Here is the gist. It would be helpful if you could create a gist and share it. Thanks!