plaidml: Wrong output dimension with `Dot` layers in PlaidML

Here is a simple Keras network using a `Dot` layer:

from keras.layers import Input, Dot, Reshape, Flatten
from keras.models import Model

model_in = Input((10,), name='input')                         # (None, 10)
model_out = Reshape((-1, 1), name='reshape')(model_in)        # (None, 10, 1)
model_out = Dot(axes=2, name='dot')([model_out, model_out])   # (None, 10, 10)
model_out = Flatten(name='flatten')(model_out)                # (None, 100) expected
model = Model(model_in, model_out)
model.summary()

If I compile this network with TensorFlow 1.14.0, the output shape of the final Flatten layer is 100, which is what I expect, since 10*10 = 100:

Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input (InputLayer)              (None, 10)           0                                            
__________________________________________________________________________________________________
reshape (Reshape)               (None, 10, 1)        0           input[0][0]                      
__________________________________________________________________________________________________
dot (Dot)                       (None, 10, 10)       0           reshape[0][0]                    
                                                                 reshape[0][0]                    
__________________________________________________________________________________________________
flatten (Flatten)               (None, 100)          0           dot[0][0]                        
==================================================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
__________________________________________________________________________________________________

However, when I switch my Keras backend to plaidml.keras.backend, the Flatten layer reports an output dimension of 10, which is clearly incorrect:

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input (InputLayer)              (None, 10)           0                                            
__________________________________________________________________________________________________
reshape (Reshape)               (None, 10, 1)        0           input[0][0]                      
__________________________________________________________________________________________________
dot (Dot)                       (None, 10, 10)       0           reshape[0][0]                    
                                                                 reshape[0][0]                    
__________________________________________________________________________________________________
flatten (Flatten)               (None, 10)           0           dot[0][0]                        
==================================================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
__________________________________________________________________________________________________

I’m using plaidml 0.6.1. Note that the output shape reported for the Dot layer is still correct, (None, 10, 10), but the output dimension of the Flatten layer is wrong (10 instead of 100).
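
If it helps narrow this down, the Dot layer delegates to the backend’s batch_dot, so the discrepancy can presumably be reproduced without building a model at all. A minimal sketch, assuming only the standard keras.backend API (K.constant, K.batch_dot, K.int_shape, K.eval):

import numpy as np
from keras import backend as K

x = K.constant(np.zeros((1, 10, 1)))   # same shape the Reshape layer produces
out = K.batch_dot(x, x, axes=2)
print(K.int_shape(out))                # static shape inferred by the backend
print(K.eval(out).shape)               # actual shape; (1, 10, 10) on TensorFlow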

About this issue

  • State: open
  • Created 5 years ago
  • Comments: 20 (1 by maintainers)

Most upvoted comments

Thank you @SleepProgger for the example. Here’s what we’ve decided: We’ll revert the PlaidML implementation of BatchDot to match the Theano implementation (and docs), and create a flag that will enable TensorFlow-like behavior. If TensorFlow eventually decides to merge with Theano’s behavior, then we’ll get rid of that code altogether. However, since we already have it, we can create the flag pretty easily. I’ll work on this within the next week and provide you with an update then.

Gotcha. I think the error is with the Dot layer. Here’s an example where the output should be (1, 2, 2) but is instead (1, 2):

from keras.layers import Input, Dot
from keras.models import Model
import numpy as np

model_in = Input((2, 1), name='input')                     # (None, 2, 1)
model_out = Dot(axes=2, name='dot')([model_in, model_in])  # should be (None, 2, 2)
model = Model(model_in, model_out)
x = np.zeros((1, 2, 1))
print(model.predict(x))

This yields array([[0., 0.]], dtype=float32), i.e. shape (1, 2).

What’s interesting is that the Keras model summary gives the correct output shape for Dot, dot (Dot) (None, 2, 2), but the prediction shape is (None, 2) instead.

If I change the backend to TensorFlow, the same network gives a prediction with the correct shape:

array([[[0., 0.],
        [0., 0.]]], dtype=float32)
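
For reference, the contraction that axes=2 is expected to perform here can be written in plain NumPy. This is only a sketch of the expected semantics (the einsum spec below is my own restatement of the batch dot, not code from either backend):

import numpy as np

x = np.zeros((1, 2, 1))                     # same input as above
# contract the last axis of both operands, keep the batch axis
expected = np.einsum('bik,bjk->bij', x, x)
print(expected.shape)                       # (1, 2, 2)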