keras: Can't get a simple XOR problem network to work, answer always array([0])

I am trying to implement a XOR-problem solving network and can’t seem to get it to work. Here’s my code:

model = Sequential()
model.add(Dense(2,2))
model.add(Activation('sigmoid'))
model.add(Dense(2,1))
model.add(Activation('softmax'))
X = numpy.array([[0,0],[0,1],[1,0],[1,1]])
y = numpy.array([[0],[1],[1],[0]])
model.compile(loss='categorical_crossentropy', optimizer='sgd')
model.fit(X, y, nb_epoch=5, batch_size=32)

No matter what input I try, the answer to predict_classes is always array([0]), with predict_proba result always being array([[ 1.]]).

I have tried other setups, with tanh activation, loss='mean_absolute_error', optimizer='rmsprop', nb_epoch=20, batch_size=16, but there was no difference.

About this issue

  • State: closed
  • Created 9 years ago
  • Comments: 22 (6 by maintainers)

Most upvoted comments

If I weren’t interested in learning how to use Keras, I wouldn’t have raised this issue. If you don’t have the time to help, then you shouldn’t spend it writing a passive-aggressive retort either.

Here’s an XOR net that works with the new API:

import numpy as np
from keras.models import Sequential
from keras.layers.core import Activation, Dense
from keras.optimizers import SGD

X = np.zeros((4, 2), dtype='uint8')
y = np.zeros(4, dtype='uint8')

X[0] = [0, 0]
y[0] = 0
X[1] = [0, 1]
y[1] = 1
X[2] = [1, 0]
y[2] = 1
X[3] = [1, 1]
y[3] = 0

model = Sequential()
model.add(Dense(2, input_dim=2))
model.add(Activation('sigmoid'))
model.add(Dense(1))
model.add(Activation('sigmoid'))

sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd, class_mode="binary")

history = model.fit(X, y, nb_epoch=10000, batch_size=4, show_accuracy=True, verbose=0)

print(model.predict(X))
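
A likely reason the original network always returned array([[ 1.]]) is its output layer: softmax normalizes across the units of a layer, so with a single output unit the result is 1.0 whatever the input is, and there is nothing left to train. This is a reading of the symptom, not something confirmed elsewhere in the thread. A minimal sketch in plain Python (no Keras; the helper names `softmax` and `sigmoid` are defined here only for illustration):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Logistic function, as used by the output layer in the working net."""
    return 1.0 / (1.0 + math.exp(-z))

# Softmax over ONE output unit: the single value is divided by itself,
# so the "probability" is 1.0 for every possible pre-activation.
single_unit = [softmax([z])[0] for z in (-5.0, 0.0, 3.0)]
print(single_unit)  # [1.0, 1.0, 1.0]

# Sigmoid on the same pre-activations still depends on the input,
# which is why a single sigmoid unit can represent a 0/1 target.
print([sigmoid(z) for z in (-5.0, 0.0, 3.0)])
```

With one output unit, sigmoid plus a binary loss is the usual pairing; softmax needs at least two units (and one-hot targets) to be meaningful.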

I don’t have the time to get your code working, but in any case solving XOR with Keras sounds like using a Merlin rocket engine to make a grilled cheese sandwich.

If you are genuinely interested in learning how to use Keras, I recommend you check out the examples page, the examples folder (6 scripts solving real problems), and the intro “30 seconds to Keras”.

If you are genuinely seeking a XOR operator, I recommend you use the ^ Python operator.
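
For reference, a minimal sketch of that operator applied to the same four input pairs used in this thread (the variable names are illustrative):

```python
# The four XOR input pairs, in the same order as X in the examples here.
pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]

# ^ is bitwise XOR on Python ints; on 0/1 values it is the logical XOR.
xor_table = [a ^ b for a, b in pairs]
print(xor_table)  # [0, 1, 1, 0]
```

This is the target column y that the networks in this thread are trained to reproduce.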

XOR example:

from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.optimizers import SGD
import numpy as np

np.random.seed(100)

model = Sequential()
model.add(Dense(2, 8))
model.add(Activation('relu'))
model.add(Dense(8, 1))
model.add(Activation('sigmoid'))
X = np.array([[0,0],[0,1],[1,0],[1,1]], "float32")
y = np.array([[0],[1],[1],[0]], "float32")
sgd = SGD(lr=0.1)
model.compile(loss='binary_crossentropy', optimizer=sgd)
model.fit(X, y, nb_epoch=1000, batch_size=1)

print(model.predict_proba(X))
[[ 0.00262257]
 [ 0.99859906]
 [ 0.9986052 ]
 [ 0.00181695]]

Don’t throw XOR in the dustbin so fast. It is still a great example in certain situations. I use XOR as an example in a class to show my students how I can extract the weights from Keras and calculate the XOR output on the classroom whiteboard. I do not have a big enough whiteboard for MNIST!

I have several examples of XOR and how to extract/calculate the weights on a very small (2-hidden layer) network.

https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class03_tensor_flow.ipynb
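
To give a flavor of the whiteboard calculation, here is a minimal sketch of a forward pass done by hand in plain Python. The weights below are hypothetical, hand-picked values, not ones extracted from a trained model or taken from the linked notebook; with a real Keras model you would read the arrays from `model.get_weights()` instead. The sketch also assumes a ReLU hidden layer with a linear output, which keeps the arithmetic exact.

```python
def relu(z):
    """Rectified linear unit for a single value."""
    return max(0.0, z)

# Hypothetical hand-picked parameters for a 2-2-1 network (assumption:
# ReLU hidden units, linear output). W1[i][j] connects input i to hidden j.
W1 = [[1.0, 1.0],
      [1.0, 1.0]]
b1 = [0.0, -1.0]
W2 = [1.0, -2.0]   # hidden j -> output
b2 = 0.0

def forward(x):
    """Compute the network output for one input pair, step by step."""
    hidden = [relu(x[0] * W1[0][j] + x[1] * W1[1][j] + b1[j]) for j in range(2)]
    return sum(h * w for h, w in zip(hidden, W2)) + b2

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [forward(x) for x in inputs]
print(outputs)  # [0.0, 1.0, 1.0, 0.0]
```

For the input (1, 1) the hidden values are 2 and 1, so the output is 2 - 2*1 = 0. Every step is small enough to write out on a board.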

This worked for me:

from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.optimizers import Adam
import numpy as np

np.random.seed(100)

model = Sequential()
model.add(Dense(units = 3, input_dim=2, activation = 'tanh'))
model.add(Dense(units = 1, activation = 'sigmoid'))
X = np.array([[0,0],[0,1],[1,0],[1,1]], "float32")
y = np.array([[0],[1],[1],[0]], "float32")
adam = Adam(lr=0.1)
model.compile(loss='binary_crossentropy', optimizer=adam)
model.fit(X, y, nb_epoch=200, batch_size=4,verbose=1)

print(model.predict_classes(X))

4/4 [==============================] - 0s - loss: 0.0041
Epoch 199/200
4/4 [==============================] - 0s - loss: 0.0041
Epoch 200/200
4/4 [==============================] - 0s - loss: 0.0040
4/4 [==============================] - 0s
[[0]
 [1]
 [1]
 [0]]