keras: Bug in masking of output in K.rnn(..., unroll=False) (for tensorflow and cntk)

Summary

Outputs are not masked correctly in tensorflow_backend.rnn(..., unroll=False). The problem is that states[0] is assumed to be equal to the output of the step_function in this line (which is not the case in other backends or with unroll=True). This assumption happens to hold for the built-in RNN cells, which is why the bug has gone undetected, but especially since the introduction of output_size on the RNN cell it is clear that it cannot be assumed in general.
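The masking step inside the symbolic while_loop of tensorflow_backend.rnn looks roughly like this (a paraphrased sketch of the library internals, not a verbatim excerpt):

# per-timestep function of tensorflow_backend.rnn, unroll=False branch (paraphrased)
output, new_states = step_function(current_input, tuple(states) + tuple(constants))
tiled_mask_t = tf.tile(mask_t, tf.stack([1, tf.shape(output)[1]]))
# masked fallback uses states[0], implicitly assuming it equals the previous output
output = tf.where(tiled_mask_t, output, states[0])

With unroll=True the fallback is the previous output instead, which is why the two settings disagree.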

Implications

  1. RNN returns the wrong output when a mask is used and the output is not equal to states[0] but has the same size, i.e. a silent error:
import numpy as np
from numpy.testing import assert_allclose

import keras
from keras.layers import Input, Masking, recurrent
from keras.models import Model


# Minimal cell: the output is the input itself (same size as the state,
# but not equal to states[0])
class Cell(keras.layers.Layer):

    def __init__(self):
        self.state_size = None
        self.output_size = None
        super(Cell, self).__init__()

    def build(self, input_shape):
        self.state_size = input_shape[-1]
        self.output_size = input_shape[-1]

    def call(self, inputs, states):
        return inputs, [s + 1 for s in states]

x = Input((3, 1), name="x")
x_masked = Masking()(x)
s_0 = Input((1,), name="s_0")
y, s = recurrent.RNN(Cell(),
                     return_state=True,
                     unroll=False)(x_masked, initial_state=s_0)
model = Model([x, s_0], [y, s])
model.compile(optimizer='sgd', loss='mse')

# last time step masked
x_arr = np.array([[[1.],[2.],[0.]]])
s_0_arr = np.array([[10.]])
y_arr, s_arr = model.predict([x_arr, s_0_arr])

# 1 is added to the initial state twice; the masked last step leaves it unchanged
assert_allclose(s_arr, s_0_arr + 2)
# the last output should equal the last output before the masked step
assert_allclose(y_arr, x_arr[:, 1, :])  # Fails!

Gives:

       AssertionError: 
       Not equal to tolerance rtol=1e-07, atol=0
       
       (mismatch 100.0%)
        x: array([12.], dtype=float32)
        y: array([2.])
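For comparison, the following sketch reuses the definitions from the example above with unroll=True; per the summary, the unrolled code path falls back to the previous output rather than states[0], so the assertion is expected to pass:

y_u, s_u = recurrent.RNN(Cell(), return_state=True,
                         unroll=True)(x_masked, initial_state=s_0)
model_u = Model([x, s_0], [y_u, s_u])
y_u_arr, s_u_arr = model_u.predict([x_arr, s_0_arr])
assert_allclose(y_u_arr, x_arr[:, 1, :])  # expected to pass with unroll=True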
  2. An exception is raised when applying an RNN with a cell whose output_size != state_size[0]:
# (same imports as in the previous example)
class Cell(keras.layers.Layer):

    def __init__(self):
        self.state_size = None
        self.output_size = None
        super(Cell, self).__init__()

    def build(self, input_shape):
        self.state_size = input_shape[-1]
        self.output_size = input_shape[-1] * 2

    def call(self, inputs, states):
        return keras.layers.concatenate([inputs]*2), [s + 1 for s in states]

x = Input((3, 1), name="x")
x_masked = Masking()(x)
s_0 = Input((1,), name="s_0")
y, s = recurrent.RNN(Cell(),
                     return_state=True,
                     unroll=False)(x_masked, initial_state=s_0)  # Fails!

Gives:

ValueError: Dimension 1 in both shapes must be equal, but are 2 and 1. Shapes are [?,2] and [?,1]. for 'rnn_1/while/Select' (op: 'Select') with input shapes: [?,?], [?,2], [?,1].
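To spell out the intended semantics, here is an eager NumPy sketch of masked output handling (masked_rnn_reference is a hypothetical helper written for illustration, not Keras API): where the mask is off, the previous output is repeated and the previous states are kept, independently of whether the output equals states[0].

import numpy as np

def masked_rnn_reference(step_fn, inputs, mask, initial_states):
    # inputs: (batch, time, features); mask: (batch, time) of 0/1
    # step_fn(x_t, states) -> (output_t, new_states)
    states = list(initial_states)
    prev_output = None
    outputs = []
    for t in range(inputs.shape[1]):
        out, new_states = step_fn(inputs[:, t], states)
        if prev_output is None:
            prev_output = np.zeros_like(out)
        m = mask[:, t:t + 1].astype(bool)  # (batch, 1), broadcasts over features
        out = np.where(m, out, prev_output)            # masked: repeat previous output
        states = [np.where(m, ns, s)                   # masked: keep previous states
                  for s, ns in zip(states, new_states)]
        prev_output = out
        outputs.append(out)
    return np.stack(outputs, axis=1), states

Running this with the step function of the first example (output equal to the input, each state incremented by 1) reproduces the expected values from the assertions above: the last output equals the last unmasked input and the state is incremented twice.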

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Comments: 17 (17 by maintainers)

Most upvoted comments

I’ve tagged it as minor because you’ve already made a PR for this bug, but of course this is not something to overlook; let me change the tag. Thanks also for your quick reaction and the PR on this bug!