keras: Bug in masking of output in K.rnn(..., unroll=False) (for tensorflow and cntk)
Summary
Outputs are not masked correctly in tensorflow_backend.rnn(..., unroll=False). The issue is that states[0] is assumed to be equal to the output of the step_function in this line (this is not the case in other backends or with unroll=True). The assumption holds for the built-in RNNCells, which is why the bug has gone undetected. Especially since the introduction of output_size in the RNNCell, it is clear that this should not be assumed in general.
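To make the difference concrete, here is a minimal NumPy sketch of a masked RNN loop. It is not the backend code: masked_rnn, step and the carry argument are hypothetical names used only for illustration. With carry="state0", a masked step emits states[0] (the assumption described above); with carry="output", it repeats the previous output.

```python
import numpy as np

def masked_rnn(step, inputs, mask, init_states, carry="output"):
    """Run `step` over the time axis of `inputs` (batch, time, features).

    carry="output": masked steps repeat the previous output (expected).
    carry="state0": masked steps emit states[0] (the assumption at issue).
    """
    states = list(init_states)
    # Call step once only to learn the output shape for the initial value.
    prev_output = np.zeros_like(step(inputs[:, 0], states)[0])
    for t in range(inputs.shape[1]):
        output, new_states = step(inputs[:, t], states)
        keep = mask[:, t][:, None]  # (batch, 1), True where the step is valid
        fallback = prev_output if carry == "output" else states[0]
        prev_output = np.where(keep, output, fallback)
        states = [np.where(keep, n, s) for n, s in zip(new_states, states)]
    return prev_output, states

def step(x, states):
    # Same behaviour as the example cell: output = input, state += 1.
    return x, [s + 1 for s in states]

x_arr = np.array([[[1.], [2.], [0.]]])
mask = np.array([[True, True, False]])  # last time step masked
s_0 = [np.array([[10.]])]

y_expected, s_expected = masked_rnn(step, x_arr, mask, s_0, carry="output")
y_assumed, s_assumed = masked_rnn(step, x_arr, mask, s_0, carry="state0")
```

In this sketch the final state is the same in both modes; only the output on the masked step differs, which is why the error is easy to miss when output and states[0] happen to coincide.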
Implications
- RNN returns the wrong output when a mask is used and the output is not equal to states[0] but has the same size, i.e. a silent error:
import numpy as np
from numpy.testing import assert_allclose

import keras
from keras.layers import Input, Masking, recurrent
from keras.models import Model


class Cell(keras.layers.Layer):

    def __init__(self):
        self.state_size = None
        self.output_size = None
        super(Cell, self).__init__()

    def build(self, input_shape):
        self.state_size = input_shape[-1]
        self.output_size = input_shape[-1]

    def call(self, inputs, states):
        # output is the input itself; every state is incremented by 1
        return inputs, [s + 1 for s in states]


x = Input((3, 1), name="x")
x_masked = Masking()(x)
s_0 = Input((1,), name="s_0")
y, s = recurrent.RNN(Cell(),
                     return_state=True,
                     unroll=False)(x_masked, initial_state=s_0)
model = Model([x, s_0], [y, s])
model.compile(optimizer='sgd', loss='mse')

# last time step masked
x_arr = np.array([[[1.], [2.], [0.]]])
s_0_arr = np.array([[10.]])
y_arr, s_arr = model.predict([x_arr, s_0_arr])

# 1 is added to initial state two times
assert_allclose(s_arr, s_0_arr + 2)
# expect last output to be the same as last output before masking
assert_allclose(y_arr, x_arr[:, 1, :])  # Fails!
Gives:
AssertionError:
Not equal to tolerance rtol=1e-07, atol=0
(mismatch 100.0%)
x: array([12.], dtype=float32)
y: array([2.])
- An exception is raised when trying to apply an RNN with a cell whose output_size != state_size[0]:
import keras
from keras.layers import Input, Masking, recurrent


class Cell(keras.layers.Layer):

    def __init__(self):
        self.state_size = None
        self.output_size = None
        super(Cell, self).__init__()

    def build(self, input_shape):
        self.state_size = input_shape[-1]
        self.output_size = input_shape[-1] * 2

    def call(self, inputs, states):
        # output is twice as wide as the state
        return keras.layers.concatenate([inputs] * 2), [s + 1 for s in states]


x = Input((3, 1), name="x")
x_masked = Masking()(x)
s_0 = Input((1,), name="s_0")
y, s = recurrent.RNN(Cell(),
                     return_state=True,
                     unroll=False)(x_masked, initial_state=s_0)  # Fails!
Gives:
ValueError: Dimension 1 in both shapes must be equal, but are 2 and 1. Shapes are [?,2] and [?,1]. for 'rnn_1/while/Select' (op: 'Select') with input shapes: [?,?], [?,2], [?,1].
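The three input shapes in the message ([?,?], [?,2], [?,1]) correspond to the mask, the cell output and states[0]. The following NumPy sketch mimics that situation under the assumption that the select op requires its two value operands to have identical shapes; strict_select is a hypothetical helper written for this illustration, not a Keras or TensorFlow function.

```python
import numpy as np

def strict_select(cond, a, b):
    """Element-wise select that, unlike np.where, refuses to broadcast a and b."""
    if a.shape != b.shape:
        raise ValueError(
            "shapes must be equal, but are {} and {}".format(a.shape, b.shape))
    return np.where(cond, a, b)

output = np.array([[0., 0.]])       # cell output at the masked step, output_size == 2
prev_output = np.array([[2., 2.]])  # output from the last unmasked step
state_0 = np.array([[12.]])         # states[0], state_size == 1
keep = np.array([[False, False]])   # mask tiled to the output width

# Selecting between the output and states[0] is shape-incompatible.
try:
    strict_select(keep, output, state_0)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# Selecting between the output and the previous output always lines up.
carried = strict_select(keep, output, prev_output)
```

Falling back to the previous output instead of states[0] avoids the mismatch, because both operands then have width output_size regardless of the state size.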
About this issue
- State: closed
- Created 6 years ago
- Comments: 17 (17 by maintainers)
I’ve put minor because you’ve already made a PR for this bug. But of course this is not something to overlook. Let me change the tag. Also thanks for your quick reaction and PR on this bug!