keras: Expected 3 dimensions but got array with shape (11, 2)

I’m training a model like so:

from keras.models import Sequential
from keras.layers import LSTM, TimeDistributed, Dense, AveragePooling1D
from keras.losses import categorical_crossentropy
from keras.optimizers import RMSprop

model = Sequential()
model.add(LSTM(24, input_shape=(1200, 19), return_sequences=True, implementation=2))
model.add(TimeDistributed(Dense(1)))
model.add(AveragePooling1D())
model.add(Dense(2, activation='softmax'))
model.compile(loss=categorical_crossentropy, optimizer=RMSprop(lr=.01))
model.fit(train_x, train_y, epochs=100, batch_size=6000, verbose=1, validation_data=(test_x, test_y))

When I run this on a very small dummy data set (while I’m working on getting it working), I get the following error:

ValueError: Error when checking model target: expected dense_2 to have 3 dimensions, but got array with shape (11, 2)

However, if I print the shape of train_y, it’s (11, 2), which is exactly the shape of the model output that Keras/Tensorflow is complaining about.

I’m at a loss as to why the model expects a 3-dimensional target when train_y is (11, 2).
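Tracing the output shape through each layer shows where the third dimension comes from. The sketch below is a plain-Python shape trace based on standard Keras layer semantics (not a run of the actual model; the helper functions are illustrative, not Keras APIs):

```python
# Hypothetical shape trace for the model above (batch axis shown as None).

def lstm(shape, units, return_sequences):
    # (batch, timesteps, features) -> (batch, timesteps, units) if sequences kept
    batch, timesteps, _ = shape
    return (batch, timesteps, units) if return_sequences else (batch, units)

def dense(shape, units):
    # Dense transforms only the last axis; the rank is preserved
    return shape[:-1] + (units,)

def avg_pool1d(shape, pool_size=2):
    # the default pool_size=2 halves the timestep axis
    batch, steps, channels = shape
    return (batch, steps // pool_size, channels)

shape = (None, 1200, 19)          # input
shape = lstm(shape, 24, True)     # (None, 1200, 24)
shape = dense(shape, 1)           # TimeDistributed(Dense(1)) -> (None, 1200, 1)
shape = avg_pool1d(shape)         # (None, 600, 1)
shape = dense(shape, 2)           # final Dense -> (None, 600, 2), still 3-D
print(shape)                      # (None, 600, 2)
```

So the final Dense outputs a rank-3 tensor, which cannot be matched against a (11, 2) target.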


About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 1
  • Comments: 39 (2 by maintainers)

Most upvoted comments

The problem is that you start with a three-dimensional layer but never reduce the dimensionality in any of the following layers. Try adding model.add(Flatten()) before the last Dense layer:

from keras.models import Sequential
from keras.layers import LSTM, TimeDistributed, Dense, AveragePooling1D, Flatten
from keras.losses import categorical_crossentropy
from keras.optimizers import RMSprop

model = Sequential()
model.add(LSTM(24, input_shape=(1200, 19), return_sequences=True, implementation=2))
model.add(TimeDistributed(Dense(1)))
model.add(AveragePooling1D())

model.add(Flatten())  # collapses (timesteps, channels) into a single axis

model.add(Dense(2, activation='softmax'))
model.compile(loss=categorical_crossentropy, optimizer=RMSprop(lr=.01))
model.fit(train_x, train_y, epochs=100, batch_size=6000, verbose=1, validation_data=(test_x, test_y))

I am having the same problem with an encoder-decoder seq2seq model for machine translation. I use embedding layers for the input to both the encoder and decoder, and I want the decoder’s target output to be one-hot encoded, but to feed it only integer tokens so as to save memory when storing the data set. I don’t want the hassle of writing my own data-generator function that converts integer tokens to one-hot arrays.

Looking at the implementation of sparse_categorical_crossentropy() in Keras, there is actually some reshaping going on, but the doc-string doesn’t make clear what input/output dims are assumed or when/how the reshaping is supposed to happen, so it’s impossible to know whether we are experiencing a bug or a feature, and how to deal with it properly.

The doc-string needs to be made more clear by someone who understands the intention of this code.

Furthermore, the doc-string needs to be “exported” somehow to the online docs because it is not shown here: https://keras.io/losses/#sparse_categorical_crossentropy
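For reference, the shape convention the loss effectively relies on can be sketched in NumPy: y_pred carries one more axis than y_true, holding class probabilities, while y_true holds integer class indices. This is a minimal illustrative sketch of that convention, not Keras’s actual implementation:

```python
import numpy as np

def sparse_cce(y_true, y_pred, eps=1e-7):
    # y_true: integer class indices, shape (..., )
    # y_pred: class probabilities, shape (..., n_classes)
    y_pred = np.clip(y_pred, eps, 1.0)
    # pick the predicted probability of the true class at every position
    picked = np.take_along_axis(y_pred, y_true[..., None], axis=-1)[..., 0]
    return -np.log(picked)

# sequence case: (batch=2, timesteps=3) targets vs (2, 3, classes=4) predictions
y_true = np.array([[0, 1, 3], [2, 2, 0]])
y_pred = np.full((2, 3, 4), 0.25)      # uniform predictions
loss = sparse_cce(y_true, y_pred)
print(loss.shape)                      # (2, 3): one loss value per timestep
```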

Who are we going to call to get this fixed? Ghostbusters?

I had a similar issue, which was indeed solved by adding a Flatten layer before the first Dense layer. However the docs might be misleading in this case, because documentation for the Dense layer claims that this dimensionality should be implicitly reduced:

Note: if the input to the layer has a rank greater than 2, then it is flattened prior to the initial dot product with kernel.

Or am I misreading that somehow?
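As far as I can tell, in TF-era Keras a Dense layer applied to a rank-3 input does not flatten it; the kernel is applied along the last axis only, so the rank is preserved. A NumPy sketch of that matmul behavior (an illustration of the broadcasting, not the Keras code itself):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(11, 600, 1))   # (batch, timesteps, features)
kernel = rng.normal(size=(1, 2))    # Dense(2) kernel: (in_features, units)
y = x @ kernel                      # matmul broadcasts over the leading axes
print(y.shape)                      # (11, 600, 2): still rank 3, no flattening
```

This matches the behavior people are seeing, not a literal reading of that doc note.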

Like @fuine, adding a Flatten layer solved my problem with this. But I’m additionally confused because (a) the necessity for a Flatten layer isn’t reflected in at least some of the examples (e.g., IMDb sentiment and text generation) and, (b) even more confusingly, actually works in at least the text generation example (I haven’t run the IMDb example). So ATM it seems like flattening the outputs of 3D layers before feeding them into 2D layers is sometimes necessary and sometimes not?

Same here. I was training the cats vs. dogs dataset with only Dense layers and hit the same issue. I added the Flatten and it worked. I’d love to hear why this fixed it.

In keras/engine/input_layer.py, line 91:

batch_input_shape = (batch_size,) + tuple(input_shape)

So Keras computes a new input shape to make batch processing work. In my case, I was training a dense network on MNIST. When my input shape was (784, 1), Keras changed it to (batch_size, 784, 1), and naturally feeding a (784, 60000) array to this throws the "expected 3 dimensions, got 2 dimensions" error. To solve this, instead of Flatten() I changed the input shape to (784,) and it worked.
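The shape bookkeeping behind this is simple enough to check directly, using the quoted line from input_layer.py (batch_size of 32 is just an example value):

```python
batch_size = 32

# a rank-2 per-sample shape makes the model expect 3-D arrays
input_shape = (784, 1)
batch_input_shape = (batch_size,) + tuple(input_shape)
print(batch_input_shape)   # (32, 784, 1)

# the fix: a rank-1 per-sample shape matches 2-D MNIST data
input_shape = (784,)
batch_input_shape = (batch_size,) + tuple(input_shape)
print(batch_input_shape)   # (32, 784)
```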

I am getting this error: Error when checking input: expected lstm_40_input to have 3 dimensions, but got array with shape (1191, 26)

model = Sequential()
model.add(LSTM(200, return_sequences=True, input_shape=(1191, 26)))
model.add(LSTM(200))
model.add(Dense(1000))
model.summary()

model.fit(input, target, epochs=10, batch_size=32)

input and target are both of shape (1191, 26). Can anyone help me with this?
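An LSTM expects 3-D input of shape (samples, timesteps, features), so a 2-D (1191, 26) array has to be reshaped first. One option, assuming each row should be treated as a single-timestep sample (that interpretation depends on what the data actually means), is:

```python
import numpy as np

data = np.zeros((1191, 26))        # placeholder standing in for the real data
x = data[:, np.newaxis, :]         # equivalently: data.reshape(1191, 1, 26)
print(x.shape)                     # (1191, 1, 26)
# ...and declare input_shape=(1, 26) in the first LSTM layer, not (1191, 26):
# input_shape is the per-sample shape and must not include the sample count.
```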

I think the issue is caused by having return_sequences=True in the LSTM() layer. This means we get a sequence of hidden-state vectors of size n_neurons, one for each time step. After you’ve built your model, e.g.

model = Sequential()
model.add(LSTM(n_neurons, return_sequences=True, input_shape=(n_time_steps, n_features)))
model.add(Flatten())
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

You can print model.summary().

Note the output_shapes:

  • after the LSTM() layer, it’s (None, n_time_steps, n_neurons);
  • after the Flatten() layer, it’s (None, n_time_steps * n_neurons);
  • if return_sequences=False, the LSTM() layer outputs (None, n_neurons), which for a single time step is essentially the same as flattening.
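The last bullet is easy to verify in NumPy for the single-timestep case: flattening the return_sequences=True output (batch, 1, n_neurons) yields the same array as keeping only the last step, which is what return_sequences=False does (the arrays here are dummies standing in for LSTM outputs):

```python
import numpy as np

n_neurons = 8
seq_out = np.arange(2 * 1 * n_neurons, dtype=float).reshape(2, 1, n_neurons)

flat = seq_out.reshape(seq_out.shape[0], -1)   # what Flatten() does
last = seq_out[:, -1, :]                       # what return_sequences=False keeps
print(np.array_equal(flat, last))              # True
```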

Same issue, it is annoying that these hidden dimension conventions are not clear, see #8527

I solved this error by vectorizing the new data I want to predict on with the same vocabulary I used to vectorize my train_texts and their labels. I saved the vocabulary to a JSON file: you can extract it from your vectorizer via its vocabulary_ attribute, then iterate over it and convert the NumPy integers to int before dumping to JSON.

When you want to vectorize a string, you can instantiate TfidfVectorizer again and pass it the vocabulary from the JSON, and then the array shapes match.

I’m pretty sure I didn’t change anything else, but I just reset my GPU and restarted my notebook. On Windows you can restart the graphics driver without turning everything off and on with WIN+CTRL+SHIFT+B. All of a sudden it started working.

Similar issue; apparently this has been going on for almost two years…