keras: Expected 3 dimensions but got array with shape (11, 2)

I’m training a model like so:

from keras.models import Sequential
from keras.layers import LSTM, TimeDistributed, Dense, AveragePooling1D
from keras.losses import categorical_crossentropy
from keras.optimizers import RMSprop

model = Sequential()
model.add(LSTM(24, input_shape=(1200, 19), return_sequences=True, implementation=2))
model.add(TimeDistributed(Dense(1)))
model.add(AveragePooling1D())
model.add(Dense(2, activation='softmax'))
model.compile(loss=categorical_crossentropy, optimizer=RMSprop(lr=.01))
model.fit(train_x, train_y, epochs=100, batch_size=6000, verbose=1, validation_data=(test_x, test_y))

When I run this on a very small dummy data set (while I’m working on getting it working), I get the following error:

ValueError: Error when checking model target: expected dense_2 to have 3 dimensions, but got array with shape (11, 2)

However, if I print the shape of train_y, it’s (11, 2), which is exactly the shape of the model output that Keras/Tensorflow is complaining about.

I’m at a loss as to why the model expects a 3-dimensional target when train_y is (11, 2).
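Tracing the output shape through each layer shows where the third dimension comes from. The sketch below is a plain-Python shape trace based on standard Keras layer semantics (not a run of the actual model; the helper functions are illustrative, not Keras APIs):

```python
# Hypothetical shape trace for the model above (batch axis shown as None).

def lstm(shape, units, return_sequences):
    # (batch, timesteps, features) -> (batch, timesteps, units) if sequences kept
    batch, timesteps, _ = shape
    return (batch, timesteps, units) if return_sequences else (batch, units)

def dense(shape, units):
    # Dense transforms only the last axis; the rank is preserved
    return shape[:-1] + (units,)

def avg_pool1d(shape, pool_size=2):
    # the default pool_size=2 halves the timestep axis
    batch, steps, channels = shape
    return (batch, steps // pool_size, channels)

shape = (None, 1200, 19)          # input
shape = lstm(shape, 24, True)     # (None, 1200, 24)
shape = dense(shape, 1)           # TimeDistributed(Dense(1)) -> (None, 1200, 1)
shape = avg_pool1d(shape)         # (None, 600, 1)
shape = dense(shape, 2)           # final Dense -> (None, 600, 2), still 3-D
print(shape)                      # (None, 600, 2)
```

So the final Dense outputs a rank-3 tensor, which cannot be matched against a (11, 2) target.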


About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 1
  • Comments: 39 (2 by maintainers)

Most upvoted comments

The problem is that you start with a three-dimensional layer but never reduce the dimensionality in any of the following layers. Try adding model.add(Flatten()) before the last Dense layer:

from keras.models import Sequential
from keras.layers import LSTM, TimeDistributed, Dense, AveragePooling1D, Flatten
from keras.losses import categorical_crossentropy
from keras.optimizers import RMSprop

model = Sequential()
model.add(LSTM(24, input_shape=(1200, 19), return_sequences=True, implementation=2))
model.add(TimeDistributed(Dense(1)))
model.add(AveragePooling1D())

model.add(Flatten())  # collapses (timesteps, channels) into a single axis

model.add(Dense(2, activation='softmax'))
model.compile(loss=categorical_crossentropy, optimizer=RMSprop(lr=.01))
model.fit(train_x, train_y, epochs=100, batch_size=6000, verbose=1, validation_data=(test_x, test_y))

I am having the same problem with an encoder-decoder seq2seq model for machine translation. I use embedding layers for the input to both the encoder and decoder, and I want the decoder’s target output to be one-hot encoded, but to feed it only integer tokens so as to save memory when storing the data set. I don’t want the hassle of writing my own data-generator function that converts integer tokens to one-hot arrays.

Looking at the implementation of sparse_categorical_crossentropy() in Keras, there is actually some reshaping going on, but the doc-string doesn’t make clear what input/output dims are assumed or when/how the reshaping is supposed to happen, so it’s impossible to know whether we are experiencing a bug or a feature, and how to deal with it properly.

The doc-string needs to be made more clear by someone who understands the intention of this code.

Furthermore, the doc-string needs to be “exported” somehow to the online docs because it is not shown here: https://keras.io/losses/#sparse_categorical_crossentropy
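For reference, the shape convention the loss effectively relies on can be sketched in NumPy: y_pred carries one more axis than y_true, holding class probabilities, while y_true holds integer class indices. This is a minimal illustrative sketch of that convention, not Keras’s actual implementation:

```python
import numpy as np

def sparse_cce(y_true, y_pred, eps=1e-7):
    # y_true: integer class indices, shape (..., )
    # y_pred: class probabilities, shape (..., n_classes)
    y_pred = np.clip(y_pred, eps, 1.0)
    # pick the predicted probability of the true class at every position
    picked = np.take_along_axis(y_pred, y_true[..., None], axis=-1)[..., 0]
    return -np.log(picked)

# sequence case: (batch=2, timesteps=3) targets vs (2, 3, classes=4) predictions
y_true = np.array([[0, 1, 3], [2, 2, 0]])
y_pred = np.full((2, 3, 4), 0.25)      # uniform predictions
loss = sparse_cce(y_true, y_pred)
print(loss.shape)                      # (2, 3): one loss value per timestep
```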

Who are we going to call to get this fixed? Ghostbusters?

I had a similar issue, which was indeed solved by adding a Flatten layer before the first Dense layer. However the docs might be misleading in this case, because documentation for the Dense layer claims that this dimensionality should be implicitly reduced:

Note: if the input to the layer has a rank greater than 2, then it is flattened prior to the initial dot product with kernel.

Or am I misreading that somehow?
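As far as I can tell, in TF-era Keras a Dense layer applied to a rank-3 input does not flatten it; the kernel is applied along the last axis only, so the rank is preserved. A NumPy sketch of that matmul behavior (an illustration of the broadcasting, not the Keras code itself):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(11, 600, 1))   # (batch, timesteps, features)
kernel = rng.normal(size=(1, 2))    # Dense(2) kernel: (in_features, units)
y = x @ kernel                      # matmul broadcasts over the leading axes
print(y.shape)                      # (11, 600, 2): still rank 3, no flattening
```

This matches the behavior people are seeing, not a literal reading of that doc note.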

Like @fuine, adding a Flatten layer solved my problem with this. But I’m additionally confused because (a) the necessity for a Flatten layer isn’t reflected in at least some of the examples (e.g., IMDb sentiment and text generation) and, (b) even more confusingly, actually works in at least the text generation example (I haven’t run the IMDb example). So ATM it seems like flattening the outputs of 3D layers before feeding them into 2D layers is sometimes necessary and sometimes not?

Same here. I was training the cats vs. dogs dataset with only Dense layers and hit the same issue. I added the Flatten and it worked. I’d love to hear why this fixed it.

In keras/engine/input_layer.py, line 91:

batch_input_shape = (batch_size,) + tuple(input_shape)

So Keras computes a new input shape to make batch processing work. In my case, I was training a dense network on MNIST. When my input shape was (784, 1), Keras changed it to (batch_size, 784, 1), and naturally feeding a (784, 60000) array to this throws the "expected 3 dimensions, got 2 dimensions" error. To solve this, instead of Flatten() I changed the input shape to (784,) and it worked.
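The shape bookkeeping behind this is simple enough to check directly, using the quoted line from input_layer.py (batch_size of 32 is just an example value):

```python
batch_size = 32

# a rank-2 per-sample shape makes the model expect 3-D arrays
input_shape = (784, 1)
batch_input_shape = (batch_size,) + tuple(input_shape)
print(batch_input_shape)   # (32, 784, 1)

# the fix: a rank-1 per-sample shape matches 2-D MNIST data
input_shape = (784,)
batch_input_shape = (batch_size,) + tuple(input_shape)
print(batch_input_shape)   # (32, 784)
```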

I am getting this error: Error when checking input: expected lstm_40_input to have 3 dimensions, but got array with shape (1191, 26)

model = Sequential()
model.add(LSTM(200, return_sequences=True, input_shape=(1191, 26)))
model.add(LSTM(200))
model.add(Dense(1000))
model.summary()

model.fit(input, target, epochs=10, batch_size=32)

input and target are both of shape (1191, 26). Can anyone help me with this?
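An LSTM expects 3-D input of shape (samples, timesteps, features), so a 2-D (1191, 26) array has to be reshaped first. One option, assuming each row should be treated as a single-timestep sample (that interpretation depends on what the data actually means), is:

```python
import numpy as np

data = np.zeros((1191, 26))        # placeholder standing in for the real data
x = data[:, np.newaxis, :]         # equivalently: data.reshape(1191, 1, 26)
print(x.shape)                     # (1191, 1, 26)
# ...and declare input_shape=(1, 26) in the first LSTM layer, not (1191, 26):
# input_shape is the per-sample shape and must not include the sample count.
```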

I think the issue is caused by having return_sequences=True in the LSTM() layer. This means we get a sequence of hidden-state vectors of size n_neurons, one for each time step. After you’ve built your model, e.g.

model = Sequential()
model.add(LSTM(n_neurons, return_sequences=True, input_shape=(n_time_steps, n_features)))
model.add(Flatten())
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

You can print model.summary().

Note the output_shapes:

  • after the LSTM() layer, it’s (None, n_time_steps, n_neurons);
  • after the Flatten() layer, it’s (None, n_time_steps * n_neurons);
  • if return_sequences=False, the LSTM() layer outputs (None, n_neurons), which for a single time step is essentially the same as flattening.
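The last bullet is easy to verify in NumPy for the single-timestep case: flattening the return_sequences=True output (batch, 1, n_neurons) yields the same array as keeping only the last step, which is what return_sequences=False does (the arrays here are dummies standing in for LSTM outputs):

```python
import numpy as np

n_neurons = 8
seq_out = np.arange(2 * 1 * n_neurons, dtype=float).reshape(2, 1, n_neurons)

flat = seq_out.reshape(seq_out.shape[0], -1)   # what Flatten() does
last = seq_out[:, -1, :]                       # what return_sequences=False keeps
print(np.array_equal(flat, last))              # True
```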

Same issue, it is annoying that these hidden dimension conventions are not clear, see #8527

I solved this error by vectorizing the new data I want to predict on with the same vocabulary I used to vectorize my train_texts and their labels. I saved the vocabulary to a JSON file: you can extract it from your vectorizer via its vocabulary_ attribute, then iterate over it and convert the NumPy integers to int before dumping to JSON.

When you want to vectorize a string, you can instantiate TfidfVectorizer again and pass it the vocabulary from the JSON, and then the array shapes match.

I’m pretty sure I didn’t change anything else, but I just reset my GPU and restarted my notebook. On Windows you can restart the graphics driver without turning everything off and on with WIN+CTRL+SHIFT+B. All of a sudden it started working.

Similar issue; apparently this has been going on for almost two years…