tensorflow: layer.output raises AttributeError because inbound nodes lost after call to activation function
System information
- custom code
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): ubuntu 18.04
- TensorFlow installed from (source or binary): pip install tensorflow-gpu
- TensorFlow version (use command below):
v2.0.0-rc2-26-g64c3d38 2.0.0 and v2.0.0-beta0-16-g1d91213fe7 2.0.0-beta1
- Python version: 3.6.9
Describe the current behavior: When a Keras layer is called on the output of an activation function, its inbound nodes are not set up properly, so calling layer.output raises an AttributeError.
Describe the expected behavior: layer.output should return the layer's output tensor.
Code to reproduce the issue:

```python
import tensorflow as tf


class MyModel(tf.keras.Model):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.dense0 = tf.keras.layers.Dense(10, name='dense0', input_shape=(5, 5, 1))
        self.dense1 = tf.keras.layers.Dense(10, name='dense1')
        self.dense2 = tf.keras.layers.Dense(10, name='dense2')

    def call(self, x):
        x = self.dense0(x)
        # if you use this line it works
        x = tf.keras.layers.ReLU()(x)
        x = self.dense1(x)
        print('correct:', self.dense1.inbound_nodes)
        # if you use this line it doesn't work
        relu = tf.keras.activations.get('relu')
        x = relu(x)
        x = self.dense2(x)
        print('incorrect:', self.dense2.inbound_nodes)
        return x


def main():
    my_model = MyModel()
    inp = tf.keras.Input(shape=(5, 5, 1))
    out = my_model(inp)
    my_model.compile(optimizer='adam',
                     loss='sparse_categorical_crossentropy')
    for l in my_model.layers:
        try:
            print(l.output)
        except AttributeError:
            print('EXCEPTION: {}.output raises attribute error'.format(l.name))


if __name__ == '__main__':
    main()
```
Other info / logs: UPDATED to include input_shape, which does not solve the problem.
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 15 (3 by maintainers)
@chahld, when defining a network in Keras, the first layer added needs to have input_shape specified; please check the link for more information. Also refer to this example. Thanks!
@chahld @zzj0402 I solved this issue with a simple trick. Basically, I added a build method to your model and explicitly computed the output of each layer, so you can now see the shapes of each layer in the summary. Note that if you don't pass tf.keras.Input(shape=(5, 5, 1)) to the model and instead pass a tensor directly, for example tf.random.uniform([5, 5, 1]), you need to get rid of the range selection in input_shape, as in the sketch below.
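A sketch of what that build method could look like, assuming the three Dense layers from the original post (the explicit wiring in build, the [1:] slice as my reading of the "range selection", and the usage lines at the bottom are illustrative, not the exact code from the gist):

```python
import tensorflow as tf


class MyModel(tf.keras.Model):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.dense0 = tf.keras.layers.Dense(10, name='dense0')
        self.dense1 = tf.keras.layers.Dense(10, name='dense1')
        self.dense2 = tf.keras.layers.Dense(10, name='dense2')

    def build(self, input_shape):
        # Explicitly push a symbolic input through every layer so that each
        # layer gets an inbound node and a known output shape.
        # If the model is called on a plain tensor such as
        # tf.random.uniform([5, 5, 1]) instead of tf.keras.Input(shape=(5, 5, 1)),
        # drop the [1:] slice and use input_shape directly, since there is no
        # batch dimension to strip.
        x = tf.keras.Input(shape=input_shape[1:])
        for layer in [self.dense0, self.dense1, self.dense2]:
            x = layer(x)
        self.built = True  # mark the model as built

    def call(self, x):
        # same structure as in the original post
        x = self.dense0(x)
        x = tf.keras.layers.ReLU()(x)
        x = self.dense1(x)
        x = tf.keras.activations.get('relu')(x)
        return self.dense2(x)


my_model = MyModel()
my_model(tf.keras.Input(shape=(5, 5, 1)))
my_model.summary()  # the summary can now show an output shape for each Dense layer
for layer in my_model.layers:
    print(layer.name, layer.output)  # no AttributeError
```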
I hope that helps, good luck! This is the reproduced version in TF v2.5, but it should work on all TF v2.x versions; please find the gist here. Thanks!
@oanush,
I am not using the Sequential model. I discovered the problem while subclassing a layer where I needed to call tf.stop_gradient. [Note: subclassing a layer does not require the input_shape parameter. In my example I'm calling the model on a tf.keras.Input, which provides the appropriate input shape. Just to prove this to myself, I've amended the code in my original post; it still fails.]
TL;DR: The inbound nodes problem happens whenever you subclass (either Model or Layer) and use functions instead of layer classes for some of the steps in the call method. You can work around this by wrapping any TensorFlow function inside a tf.keras.layers.Lambda (see the workaround below).
Detailed comments:
- Maybe tf.keras.activations.get should return the ReLU layer, not the relu function.
- So long as you use only Keras layers in the model, the inbound nodes are updated correctly.
- If you use the functional interface outside of a keras.Layer, the error does not happen. So there is something about the context inside the Layer.call method that is not working properly (see the functional-interface example below).
- The fact that the node structure is updated correctly in the functional interface makes me think that the Layer.call problem is actually a bug.
Workaround: My original problem happened because I needed to call tf.stop_gradient. There is no corresponding Keras layer for this, but you can get around it by wrapping the call inside a Lambda layer:
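For example, something along these lines (the layer class and its other contents are illustrative, not my exact code):

```python
import tensorflow as tf


class StopGradientBlock(tf.keras.layers.Layer):
    """Illustrative layer: a Dense step whose output has its gradient stopped."""

    def __init__(self, units=10, **kwargs):
        super().__init__(**kwargs)
        self.dense = tf.keras.layers.Dense(units)
        # Wrapping the plain TF function in a Lambda keeps Keras' node
        # bookkeeping intact, so downstream layers still get inbound nodes.
        self.stop_grad = tf.keras.layers.Lambda(tf.stop_gradient)

    def call(self, x):
        x = self.dense(x)
        # instead of: x = tf.stop_gradient(x)
        return self.stop_grad(x)
```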
Addendum:
See my original post for code to reproduce the problem.
Here is an example using the functional interface that works:
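A sketch along those lines (not the exact code from the original comment; layer names are illustrative):

```python
import tensorflow as tf

inp = tf.keras.Input(shape=(5, 5, 1))
x = tf.keras.layers.Dense(10, name='dense0')(inp)

# Apply the plain relu function directly to the symbolic tensor, just as in
# the subclassed model above.
relu = tf.keras.activations.get('relu')
x = relu(x)

x = tf.keras.layers.Dense(10, name='dense1')(x)
model = tf.keras.Model(inputs=inp, outputs=x)

# Outside of Layer.call the node structure is updated correctly, so
# layer.output works for every layer (including the op layer Keras inserts
# for the raw relu call).
for layer in model.layers:
    print(layer.name, layer.output)
```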