plaidml: Wrong output dimension with `Dot` layers in PlaidML
Here is a simple Keras neural network using a `Dot` layer:
```python
from keras.layers import Input, Dense, Dot, Reshape, Flatten
from keras.models import Model

model_in = Input((10,), name='input')
model_out = model_in
model_out = Reshape((-1, 1), name='reshape')(model_out)
model_out = Dot(axes=2, name='dot')([model_out, model_out])
model_out = Flatten(name='flatten')(model_out)
model = Model(model_in, model_out)
model.summary()
```
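As a concrete check (not part of the original snippet; the zero-valued test input below is an arbitrary choice), the runtime output shape can also be inspected directly:

```python
import numpy as np

# Flattened output should have 10 * 10 = 100 features per sample.
print(model.predict(np.zeros((1, 10))).shape)  # (1, 100) on the TensorFlow backend
```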
If I build this network with the TensorFlow 1.14.0 backend, the output shape is 100 (which is what I expect, since 10*10 = 100):
```
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input (InputLayer)              (None, 10)           0
__________________________________________________________________________________________________
reshape (Reshape)               (None, 10, 1)        0           input[0][0]
__________________________________________________________________________________________________
dot (Dot)                       (None, 10, 10)       0           reshape[0][0]
                                                                 reshape[0][0]
__________________________________________________________________________________________________
flatten (Flatten)               (None, 100)          0           dot[0][0]
==================================================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
__________________________________________________________________________________________________
```
However, when I switch my Keras backend to `plaidml.keras.backend`, I get an output dimension of 10, which is clearly incorrect:
```
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input (InputLayer)              (None, 10)           0
__________________________________________________________________________________________________
reshape (Reshape)               (None, 10, 1)        0           input[0][0]
__________________________________________________________________________________________________
dot (Dot)                       (None, 10, 10)       0           reshape[0][0]
                                                                 reshape[0][0]
__________________________________________________________________________________________________
flatten (Flatten)               (None, 10)           0           dot[0][0]
==================================================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
__________________________________________________________________________________________________
```
I’m using PlaidML 0.6.1. Note that the stated output dimension of the `Dot` layer is correct (10*10), but the output dimension of the `Flatten` layer is incorrect.
About this issue
- State: open
- Created: 5 years ago
- Comments: 20 (1 by maintainers)
Thank you @SleepProgger for the example. Here’s what we’ve decided: we’ll revert the PlaidML implementation of `BatchDot` to match the Theano implementation (and docs), and create a flag that enables TensorFlow-like behavior. If TensorFlow eventually decides to merge with Theano’s behavior, then we’ll get rid of that code altogether; since we already have it, we can create the flag pretty easily. I’ll work on this within the next week and provide you with an update then.
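For context, Keras’s `Dot` layer computes its result with the backend’s `batch_dot`, so the shape semantics under discussion can be sketched directly against the backend (shapes here mirror the network above; the printed shape is what the TensorFlow backend gives):

```python
import numpy as np
from keras import backend as K

x = K.variable(np.ones((1, 10, 1)))  # (batch, 10, 1)
# Contracting axis 2 of both operands should leave shape (batch, 10, 10).
out = K.batch_dot(x, x, axes=2)
print(K.int_shape(out))  # (1, 10, 10) on the TensorFlow backend
```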
Gotcha. I think the error is with the `Dot` layer. Here’s an example where the output should be `(1, 2, 2)` but instead is `(1, 2)`:
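The snippet itself isn’t reproduced above, but a minimal example consistent with the shapes described (the input size and the zero-valued test input are assumptions) would be:

```python
import numpy as np
from keras.layers import Input, Reshape, Dot
from keras.models import Model

model_in = Input((2,), name='input')
x = Reshape((-1, 1), name='reshape')(model_in)  # (None, 2, 1)
model_out = Dot(axes=2, name='dot')([x, x])     # should be (None, 2, 2)
model = Model(model_in, model_out)

pred = model.predict(np.zeros((1, 2)))
print(repr(pred))  # expected shape (1, 2, 2); PlaidML reportedly returns (1, 2)
```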
Which yields:

```
array([[0., 0.]], dtype=float32)
```
What’s interesting is that the Keras model summary gives the correct output shape for `Dot`:

```
dot (Dot)                       (None, 2, 2)
```

but the prediction shape is `(None, 2)` instead. If I change the backend to TensorFlow, I get the correct shape from the prediction for the same network:
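For reference, with the zero-valued input assumed in the sketch above, the TensorFlow backend would return the `(1, 2, 2)` zero array:

```
array([[[0., 0.],
        [0., 0.]]], dtype=float32)
```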