keras: Error when checking model target with sparse_categorical_crossentropy in 1.0.4

The following script fails with Keras 1.0.4, but worked with 1.0.3:

from keras.layers import Dense, Activation
from keras.models import Sequential
model = Sequential([
    Dense(32, input_dim=2),
    Activation('relu'),
    Dense(10),
    Activation('softmax'),
])
model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd')
model.fit([[0,1], [1,1], [1,0]], [1,2,3])

gives the following exception:

Exception: Error when checking model target: expected activation_6 to have shape (None, 10) but got array with shape (3, 1)

The check doesn’t seem to take into account the sparse categorical cross-entropy loss, which should take only one integer target per training example. This has been tested with both the TensorFlow and Theano backends.

About this issue

  • State: closed
  • Created 8 years ago
  • Reactions: 6
  • Comments: 20 (4 by maintainers)

Most upvoted comments

@damianhinch In the model above, the output at the end is 10-dimensional (Dense(output_dim=10)). Because of the softmax layer this can be interpreted as a probability distribution over the 10 classes I am trying to predict (e.g. 10 digits if using MNIST). A typical output would then look like [0.1, 0.05, 0.3, ... , 0.02].

The problem then is that I am trying to fit this output (let’s call it y_pred) to y_train. But each example in y_train is just a digit (e.g. 2), so it has shape (1,), whereas y_pred has shape (10,). Hence the mismatch.

There are two ways to solve this. Either you encode y_train as a one-hot vector, i.e. a vector which is 1 at the index of the digit represented and 0 elsewhere. So for 2 this would be [0, 0, 1, 0, 0, ..., 0]. You can then use this one-hot encoding to fit the model with categorical_crossentropy.

Alternatively, you can use sparse_categorical_crossentropy, which takes care of this transformation for you internally. The word sparse is used here because 2 is a sparse representation of [0, 0, 1, 0, ..., 0], in the sense that it refers to the index of the nonzero element.
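As a minimal sketch of why the two losses agree, the plain-Python snippet below computes the cross-entropy once from a one-hot target and once from the integer label. The helper names and the example probabilities are made up for illustration; this is not how Keras implements either loss.

```python
import math

def one_hot(label, num_classes):
    """Return a list that is 1.0 at index `label` and 0.0 elsewhere."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def categorical_crossentropy(y_true_one_hot, y_pred):
    """Cross-entropy against a one-hot target: -sum(t * log(p))."""
    return -sum(t * math.log(p) for t, p in zip(y_true_one_hot, y_pred) if t > 0)

def sparse_categorical_crossentropy(label, y_pred):
    """Cross-entropy against an integer class index: -log(p[label])."""
    return -math.log(y_pred[label])

# Illustrative softmax output over 10 classes (the values sum to 1).
y_pred = [0.1, 0.05, 0.3, 0.05, 0.1, 0.1, 0.1, 0.1, 0.08, 0.02]
label = 2

dense_loss = categorical_crossentropy(one_hot(label, 10), y_pred)
sparse_loss = sparse_categorical_crossentropy(label, y_pred)
print(dense_loss, sparse_loss)  # both equal -log(0.3)
```

Both calls pick out the predicted probability of the true class, so the integer label carries the same information as the one-hot vector.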

What's up @EmilienDupont, try sparse_categorical_crossentropy and reshape the targets: y_train = y_train.reshape((-1, 1))
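For readers unsure what reshape((-1, 1)) does to the targets, here is a plain-Python stand-in. It assumes y_train is a flat sequence of integer labels; as_column and shape_of are hypothetical helpers that only mimic the effect of the NumPy call on nested lists.

```python
def as_column(labels):
    """Wrap each label in its own list, turning shape (n,) into (n, 1)."""
    return [[label] for label in labels]

def shape_of(nested):
    """Return the shape of a rectangular nested list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0] if nested else None
    return tuple(shape)

y_train = [1, 2, 3]
y_column = as_column(y_train)

print(shape_of(y_train))   # (3,)
print(shape_of(y_column))  # (3, 1)
```

Each training example keeps a single integer target; only the container gains a second axis of length 1.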

@Emerson (None, 10) is just a placeholder for an array with an unknown number of rows and 10 columns. During training this would typically take a shape (<batch_size>, 10) depending on the size of your batch.
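The role of None can be illustrated with a small shape check. This is only a sketch: shape_matches is a hypothetical helper, not the check Keras actually runs, and it simply treats None as a wildcard that matches any size.

```python
def shape_matches(expected, actual):
    """Return True if `actual` fits `expected`, where None matches any size."""
    if len(expected) != len(actual):
        return False
    return all(e is None or e == a for e, a in zip(expected, actual))

expected = (None, 10)  # unknown number of rows, 10 columns

print(shape_matches(expected, (3, 10)))    # True
print(shape_matches(expected, (128, 10)))  # True
print(shape_matches(expected, (3, 1)))     # False
```

The last call mirrors the mismatch in the reported exception: the batch axis is free, but the second axis must be 10.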

Hi, I got a similar error: ValueError: Error when checking target: expected sequential_1 to have 4 dimensions, but got array with shape (1481, 3). The input shapes are X_train: (1481, 64, 64, 3) and y_train: (1481, 3), where y_train is a one-hot encoded array like [[0 1 0], [1 0 0], ...].

The model definition is below:

image_size = (64, 64)
input_image = Input(shape=(*image_size, 3))
base_model = VGG16(input_tensor=input_image, include_top=False)
top_model = Sequential()
top_model.add(Dense(3, input_shape=base_model.output_shape[1:], activation="softmax"))
model = Model(inputs=base_model.input, outputs=top_model(base_model.output))

I understand that the error means the shape of y_train is not what the model expects, but I do not know how to fix it. Could you give me some advice if possible?
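One likely cause, assuming the standard VGG16 architecture (five 2x2 pooling stages and 512 channels in the last convolutional block): with include_top=False the base model emits a 3D feature map per image, and a Dense layer transforms only the last axis, so the model output stays 4-dimensional once the batch axis is counted. The shape arithmetic below is a plain-Python sketch with hypothetical helper names, not Keras code.

```python
from functools import reduce
from operator import mul

def vgg16_feature_shape(height, width):
    """Per-image output shape of VGG16 without its top, assuming five 2x2 poolings."""
    return (height // 32, width // 32, 512)

def dense_output_shape(input_shape, units):
    """A dense layer replaces only the last axis with `units`."""
    return input_shape[:-1] + (units,)

def flatten_output_shape(input_shape):
    """Flattening collapses all per-image axes into a single axis."""
    return (reduce(mul, input_shape, 1),)

features = vgg16_feature_shape(64, 64)
without_flatten = dense_output_shape(features, 3)
with_flatten = dense_output_shape(flatten_output_shape(features), 3)

print(features)         # (2, 2, 512)
print(without_flatten)  # (2, 2, 3)
print(with_flatten)     # (3,)
```

Under these assumptions, adding a Flatten (or a global pooling) layer before the Dense layer would give each image a 3-element output, which matches a y_train of shape (1481, 3).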

I had this same error before! I found out that the last layer is treated as the output layer, so make sure to change the last layer to Dense(OUTPUT_SIZE).

@EmilienDupont could you explain what you did? I also have the same problem.