keras: Feeding input to an intermediate layer fails with “Graph disconnected” exception
I am writing a pipeline that fine-tunes the pre-trained models shipped with Keras 1.2.0. To speed it up, instead of freezing the layers I try to:
- Feed the training images once to the “frozen” part of the network and store the intermediate output to a file.
- Train iteratively the remaining network by feeding directly the intermediate output from the file.
If you don’t use data augmentation, this should yield a significant speed improvement. Unfortunately, step 2 fails with a “Graph disconnected” exception. I tried alternative ways to do this (such as the K.function() approach), but it still fails.
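For context, the intended workflow looks roughly like the sketch below, where frozen_model and head_model are hypothetical names for the two halves of the split network:

import numpy as np

# Step 1: run the training images through the frozen half once and cache the result
features = frozen_model.predict(train_images)
np.save('/path/to/features.npy', features)

# Step 2: iterate quickly on the remaining layers, feeding the cached features directly
features = np.load('/path/to/features.npy')
head_model.fit(features, train_labels, nb_epoch=10, batch_size=32)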
Below you will find a simple example that reproduces the problem and the error message:
import keras.applications
from keras.models import Model
from keras.layers import Input
from keras.preprocessing import image
from keras.applications.imagenet_utils import preprocess_input
import numpy as np
# Read some random image
img = image.load_img('/path/to/image.jpg', target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
# Load a pre-trained model
model = keras.applications.resnet50.ResNet50(weights='imagenet', include_top=False, input_tensor=Input(shape=(224, 224, 3)))
# Feed the image and get the bn_conv1 output: WORKS!
bn_conv1_model = Model(input=model.input, output=model.get_layer('bn_conv1').output)
bn_conv1_output = bn_conv1_model.predict(x)
# Feed directly the bn_conv1 output to the remaining layers: FAILS!
avg_pool_model = Model(input=Input(model.get_layer('bn_conv1').output_shape[1:]), output=model.get_layer('avg_pool').output) # This line throws exception
avg_pool_output = avg_pool_model.predict(bn_conv1_output)
The error message is:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 1987, in __init__
    str(layers_with_complete_input))
RuntimeError: Graph disconnected: cannot obtain value for tensor Tensor("input_1:0", shape=(?, 224, 224, 3), dtype=float32) at layer "input_1". The following previous layers were accessed without issue: []
Graph disconnected normally means your input and output are not part of the same graph. If your input was not the variable you used to create your output, this is the error you will get.
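A minimal illustration of that failure mode, reusing the names from the snippet above (new_input is a brand-new tensor, so old_output does not depend on it):

new_input = Input(shape=(224, 224, 3))            # fresh tensor, part of no existing graph path
old_output = model.get_layer('avg_pool').output   # built from model.input, not new_input
Model(input=new_input, output=old_output)         # raises RuntimeError: Graph disconnected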
Hi @engharat
I have not yet written a proper solution for that. The network graphs can be very complex, especially for networks that branch out and merge a lot, and handling this requires writing graph-traversal algorithms. In the future I’ll probably do this and contribute it back to Keras, but I have not done it yet.
Below I send you the latest version of the “terrible” solution that I’m using. Rest assured it is just as terrible as the previous one:
You’re making a new input layer, Input(model.get_layer('bn_conv1').output_shape[1:]), and using an old output, output=model.get_layer('avg_pool').output. These are not connected in any way, so of course this fails. The model outputs are tensors built from the model inputs; you can’t just use them with a different set of inputs. The simplest way to do what you want would be to build a new model using only the layers that you want.
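For a plain linear stack of layers, that rebuilding can be automated; below is a minimal sketch (a hypothetical split_tail helper on the Keras 1.x API) that re-applies each layer past the split point to a fresh input. It deliberately ignores branching and merging, which is exactly what makes models like ResNet50 hard to split this way:

from keras.models import Model
from keras.layers import Input

def split_tail(model, split_layer_name):
    # Fresh input tensor matching the shape of the intermediate output
    new_input = Input(model.get_layer(split_layer_name).output_shape[1:])
    # Re-apply every subsequent layer to the new tensor (linear models only)
    x = new_input
    names = [l.name for l in model.layers]
    for layer in model.layers[names.index(split_layer_name) + 1:]:
        x = layer(x)
    return Model(input=new_input, output=x)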
If you want to be able to reuse the outputs from the model, you might be able to use clone and replace but that would be a long discussion you should have on the tensorflow forums.
Cheers
@datumbox’s solution really helps my work. Thank you very much. Actually, it seems that the “terrible” solution is the only working solution found online for this situation.
I think an important feature like model splitting should be provided in the official Keras API; maybe someday it will be.
Just to clarify my use-case (similar to issue 5083): I am trying to fine-tune a ResNet50 following the InceptionV3 snippet in the Keras documentation. The only problem with the original snippet is that it’s very slow. Instead of freezing the various layers, one could split the model in two parts, run the data through the first part and persist the intermediate outputs on disk, and then train/tune the second part quickly.
Here are the steps that I am following:
Part 1: Training the network with custom classification layers:
This approach speeds up the iterations by a factor of 20 compared to freezing the layers.
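(The original snippet for Part 1 is not preserved in this extract; purely as an illustration, and assuming the cached base-model outputs and targets are already loaded as features, labels and nb_classes, it might look like this:)

from keras.models import Sequential
from keras.layers import Flatten, Dense

# Hypothetical classification head trained directly on the cached features
head = Sequential([
    Flatten(input_shape=features.shape[1:]),
    Dense(256, activation='relu'),
    Dense(nb_classes, activation='softmax'),
])
head.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
head.fit(features, labels, nb_epoch=10, batch_size=32)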
Part 2: Fine-tuning the network:
This speeds up the iterations by a factor of 3 compared to freezing the layers.
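(Again only a hedged sketch: tail_model stands in for whatever sub-network the split produced, trained on the same cached features so the expensive frozen base is never re-run:)

from keras.optimizers import SGD

# Fine-tune all tail layers with a small learning rate
for layer in tail_model.layers:
    layer.trainable = True
tail_model.compile(optimizer=SGD(lr=1e-4, momentum=0.9),
                   loss='categorical_crossentropy')
tail_model.fit(features, labels, nb_epoch=5, batch_size=32)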
As I said earlier, to split the model I export the configuration, manipulate it and reconstruct part of the network. This is a very wasteful and poor solution. Do you think there is any low-level API that could help me achieve the same result?
Here is my “terrible” & non-generic solution for splitting a model:
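(The snippet itself was not preserved in this extract. What follows is a hedged reconstruction of the approach described above — dump the JSON config, drop everything up to the split layer, rewire the first kept layers to a fresh input, rebuild, and copy the weights by name. It is not the author’s actual code, and it only handles cuts that no branch crosses:)

import json
import copy
from keras.models import model_from_json

def split_model_at(model, split_layer_name):
    # Dump the architecture and locate the split point
    cfg = json.loads(model.to_json())
    layers = cfg['config']['layers']
    names = [l['name'] for l in layers]
    cut = names.index(split_layer_name)

    # Synthesize a fresh InputLayer with the intermediate tensor's shape
    shape = model.get_layer(split_layer_name).output_shape
    input_cfg = {'class_name': 'InputLayer', 'name': 'split_input',
                 'config': {'name': 'split_input', 'batch_input_shape': shape,
                            'input_dtype': 'float32', 'sparse': False},
                 'inbound_nodes': []}

    # Keep only the layers after the cut; re-point any reference to the
    # split layer at the new input (assumes no branch crosses the cut)
    kept = copy.deepcopy(layers[cut + 1:])
    for layer in kept:
        for node in layer['inbound_nodes']:
            for ref in node:
                if ref[0] == split_layer_name:
                    ref[0] = 'split_input'
    cfg['config']['layers'] = [input_cfg] + kept
    cfg['config']['input_layers'] = [['split_input', 0, 0]]

    # Rebuild the tail and copy the weights layer-by-layer by name
    tail = model_from_json(json.dumps(cfg))
    for layer in tail.layers[1:]:
        layer.set_weights(model.get_layer(layer.name).get_weights())
    return tail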
This is exactly my case. How can I solve this issue when the input is not my variable (my original data)? Thank you.
Hello everyone,
I am trying to do a similar thing. I saved the intermediate output to disk and now want to use it to train and score the remaining model. Was there any addition to the Keras API for this issue? @datumbox Were you able to find another solution?
Yes I agree, you should definitely include it. If we visualize the model we will see that it makes absolutely no sense not to include the merge. 😃
My plan is to complete a less terrible solution which is more generic and allows you to split specific nodes. I’ll post it here once I have it or send a pull request.
That was the first thing I tried, but the Sequential API will not work with complicated models such as ResNet50 (see the error message below).
I understand now that splitting a non-Sequential model is not an easy task, since a layer near the bottom can in theory take multiple inputs from higher layers. As I am not yet familiar with the low-level API of Keras, I wrote a terrible solution that extracts the JSON architecture, manipulates it and reconstructs the partial model. This works and actually speeds up the fine-tuning of models by a factor of 3, but it is far from an elegant and generic solution. I can share the snippet if you want.
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 324, in add
    output_tensor = layer(self.outputs[0])
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 517, in __call__
    self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 571, in add_inbound_node
    Node.create_node(self, inbound_layers, node_indices, tensor_indices)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 155, in create_node
    output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/normalization.py", line 128, in call
    self.add_updates([K.moving_average_update(self.running_mean, mean, self.momentum),
  File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 364, in moving_average_update
    variable, value, momentum)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/moving_averages.py", line 70, in assign_moving_average
    update_delta = _zero_debias(variable, value, decay)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/moving_averages.py", line 177, in _zero_debias
    trainable=False)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 1024, in get_variable
    custom_getter=custom_getter)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 850, in get_variable
    custom_getter=custom_getter)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 346, in get_variable
    validate_shape=validate_shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 331, in _true_getter
    caching_device=caching_device, validate_shape=validate_shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py", line 632, in _get_single_variable
    name, "".join(traceback.format_list(tb))))
ValueError: Variable bn_conv1_running_mean/biased already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:

  File "/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py", line 364, in moving_average_update
    variable, value, momentum)
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/normalization.py", line 128, in call
    self.add_updates([K.moving_average_update(self.running_mean, mean, self.momentum),
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 155, in create_node
    output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))