tensorflow: Cannot use dict-based datasets with keras.Model.fit.

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.0.0
  • Python version: 3.6.4
  • CUDA/cuDNN version: 10.1 / 7.6.2
  • GPU model and memory: (not specified)

To keep both the dataset and the Keras model well structured, I create a dataset and a vanilla model like this.

# dataset is something like <BatchDataset shapes: ({input: (None, 100)}, {output: (None, 10)}), types: ({input: tf.float32}, {output: tf.float32})>

# subclass model is something like
import tensorflow as tf

class VanillaModel(tf.keras.Model):

  def __init__(self, num_units, **kwargs):
    super(VanillaModel, self).__init__(**kwargs)
    self.num_units = num_units

    # One linear projection layer.
    self.dense_proj = tf.keras.layers.Dense(num_units, activation='relu')

  def call(self, features):
    """Forward pass."""
    output = self.dense_proj(features['input'])
    return {
        'output': output
    }

When I use the dict-based dataset (from tfds) with keras.Model.fit, the first call raises an exception:

# Compile model using dict with same keys.
model.compile('adam', {'output': 'mse'})

# The error
 File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training_utils.py", line 1248, in cast_if_floating_dtype_and_mismatch
    if target.dtype != out.dtype:
AttributeError: 'str' object has no attribute 'dtype'

I checked the code and found that, when a dict is passed, iterating through zip(targets, outputs) yields only the dict keys, so the comparison is done on string keys, which have no dtype. (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/engine/training_utils.py#L1246)
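For context, here is a minimal standalone snippet (not TensorFlow's actual code) showing why this fails: iterating zip over two dicts pairs up their keys, not their values, so the loss-matching loop ends up comparing strings.

```python
# zip() over dicts iterates their keys, not their values.
targets = {'output': [1.0, 2.0]}   # stand-in for the targets dict
outputs = {'output': [3.0, 4.0]}   # stand-in for the model outputs dict

pairs = list(zip(targets, outputs))
print(pairs)  # [('output', 'output')] -- string keys, which have no .dtype
```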

So how can I use a dict-based dataset and model with keras.Model.fit?

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 20 (8 by maintainers)

Most upvoted comments

Thanks for the issue all! This is now fixed in the latest tf-nightly; you can use any arbitrary nested structure of data with subclassed Models.

Code below works (note I had to add a call to dataset.batch, this is needed to create batches of data when using from_tensor_slices):

import numpy as np
import tensorflow as tf

class VanillaModel(tf.keras.Model):

  def __init__(self, num_units, **kwargs):
    super(VanillaModel, self).__init__(**kwargs)
    self.num_units = num_units

    # One linear projection layer.
    self.dense_proj1 = tf.keras.layers.Dense(num_units, activation='relu')
    self.dense_proj2 = tf.keras.layers.Dense(num_units, activation='relu')

  def call(self, features):
    """Forward pass."""
    proj1_output = self.dense_proj1(features['input'])
    proj2_output = self.dense_proj2(features['input'])
    return {
        'proj1_output': proj1_output,
        'proj2_output': proj2_output
    }

input_tensor = np.random.normal(size=(50, 32)).astype(np.float32)
output_tensor1 = np.random.normal(size=(50, 16)).astype(np.float32)
output_tensor2 = np.random.normal(size=(50, 16)).astype(np.float32)

dataset = tf.data.Dataset.from_tensor_slices(
    ({'input': input_tensor},
     {'proj1_output': output_tensor1, 'proj2_output': output_tensor2}))
dataset = dataset.batch(10)  # This needs to be called to create batches of data.
model = VanillaModel(16)

model.compile('adam', {'proj1_output': 'mse', 'proj2_output': 'mae'})
model.fit(dataset)

I faced the same issue. My workaround is to avoid the subclassing approach: when I switch to the functional API (tf.keras.Model(inputs=…, outputs=…)), the problem goes away.

Alternatively, before model.compile, add two lines: inputs = tf.keras.layers.Input(shape=(…)) and then model(inputs). You need to run the model once; model.build() doesn’t help.

In fact I found there are quite a few limitations on subclassed models, and there is no good guideline on how to avoid them. I hope TensorFlow can improve the usability of subclassed models. I really prefer subclassed models to sequential/functional models, because that’s how we can easily debug the model with eager execution.
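The functional-API workaround can be sketched as follows. The shapes and the 'input'/'output' layer names below are assumptions chosen to match the dataset structure from the original report; adjust them to your own keys.

```python
import numpy as np
import tensorflow as tf

# Functional-API model: naming the input and output layers after the
# dataset's dict keys lets Model.fit match losses and targets by name.
inputs = tf.keras.Input(shape=(100,), name='input')
outputs = tf.keras.layers.Dense(10, activation='relu', name='output')(inputs)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

model.compile('adam', {'output': 'mse'})

# Dict-based dataset with keys matching the layer names above.
x = np.random.normal(size=(50, 100)).astype(np.float32)
y = np.random.normal(size=(50, 10)).astype(np.float32)
dataset = tf.data.Dataset.from_tensor_slices(
    ({'input': x}, {'output': y})).batch(10)

model.fit(dataset, epochs=1, verbose=0)
```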

@jgmakin When using Model.fit, everything runs inside a graph anyway (a tf.function graph), so performance should be good.

For more, check out this guide
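This is easy to verify: unless run_eagerly=True is passed to compile(), Keras traces the train step into a tf.function graph. A minimal sketch (the toy models here are mine, not from the thread):

```python
import tensorflow as tf

# By default, Model.fit compiles the train step into a tf.function graph;
# run_eagerly stays False unless you opt out explicitly.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile('sgd', 'mse')
print(bool(model.run_eagerly))        # False -> graph execution

eager_model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
eager_model.compile('sgd', 'mse', run_eagerly=True)
print(bool(eager_model.run_eagerly))  # True -> step-by-step eager execution
```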

If I use tf.feature_column.categorical_column_with_vocabulary_list, I am able to specify the column name in Keras. But I hit the issue when I called the model_to_estimator API to convert the model back to an Estimator, as the converted model was expecting inputs like input1, input2, ... as you mentioned.