keras: Feeding input to intermediate layer fails with Graph disconnected Exception

I am writing a pipeline that fine-tunes the pre-trained models of Keras 1.2.0. To speed it up, instead of freezing the layers I try to:

Feed the training images once to the “frozen” part of the network and store the intermediate output to a file.
Train iteratively the remaining network by feeding directly the intermediate output from the file.

If you don’t use data augmentation, this should yield a significant speed improvement. Unfortunately the step 2 fails with a “Graph Disconnected” exception. I tried alternative ways to do this (such as using the K.function() approach) but it still fails.

Below you will find a simple example that reproduces the problem and the error message:

import keras.applications
from keras.models import Model
from keras.layers import Input
from keras.preprocessing import image
from keras.applications.imagenet_utils import preprocess_input
import numpy as np

# Read some random image
img = image.load_img('/path/to/image.jpg', target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

# Load a pre-trained model
model = keras.applications.resnet50.ResNet50(weights='imagenet', include_top=False, input_tensor=Input(shape=(224, 224, 3)))

# Feed the image and get the bn_conv1 output: WORKS!
bn_conv1_model = Model(input=model.input, output=model.get_layer('bn_conv1').output)
bn_conv1_output = bn_conv1_model.predict(x)

# Feed directly the bn_conv1 output to the remaining layers: FAILS!
avg_pool_model = Model(input=Input(model.get_layer('bn_conv1').output_shape[1:]), output=model.get_layer('avg_pool').output) # This line throws exception
avg_pool_output = avg_pool_model.predict(bn_conv1_output)

The error message is: Traceback (most recent call last): File “<stdin>”, line 1, in <module> File “/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py”, line 1987, in init str(layers_with_complete_input)) RuntimeError: Graph disconnected: cannot obtain value for tensor Tensor(“input_1:0”, shape=(?, 224, 224, 3), dtype=float32) at layer “input_1”. The following previous layers were accessed without issue: []

About this issue

Original URL
State: closed
Created 7 years ago
Reactions: 4
Comments: 23 (10 by maintainers)

Most upvoted comments

Graph disconnected normally means your input and output are not part of the same graph. If your input was not the variable you used to create your output, this is the error you will get.

+37

bstriner on Jan 20, 2017

Hi @engharat

I have not yet written a proper solution for that. The network graphs can be very complex, especially for networks that branch-out and merge a lot and this requires writting graph traversal algorithms. On the future I’ll probably do this and contribute it back to Keras but I have not done it yet.

Below I send you the latest version of the “terrible” solution that I’m using. Rest assured it is equally terrible as the previous one:

def split(model, start, end):
    confs = model.get_config()
    kept_layers = set()
    for i, l in enumerate(confs['layers']):
        if i == 0:
            confs['layers'][0]['config']['batch_input_shape'] = model.layers[start].input_shape
            if i != start:
                confs['layers'][0]['name'] += str(random.randint(0, 100000000)) # rename the input layer to avoid conflicts on merge
                confs['layers'][0]['config']['name'] = confs['layers'][0]['name']
        elif i < start or i > end:
            continue
        kept_layers.add(l['name'])
    # filter layers
    layers = [l for l in confs['layers'] if l['name'] in kept_layers]
    layers[1]['inbound_nodes'][0][0][0] = layers[0]['name']
    # set conf
    confs['layers'] = layers
    confs['input_layers'][0][0] = layers[0]['name']
    confs['output_layers'][0][0] = layers[-1]['name']
    # create new model
    submodel = Model.from_config(confs)
    for l in submodel.layers:
        orig_l = model.get_layer(l.name)
        if orig_l is not None:
            l.set_weights(orig_l.get_weights())
    return submodel

+13

datumbox on Jul 22, 2017

You’re making a new input layer Input(model.get_layer('bn_conv1').output_shape[1:] and using an old output output=model.get_layer('avg_pool').output. These are not connected in any way so of course this fails.

The model outputs are tensors based on the model inputs. You can’t just use them for a different set of inputs. The simplest way to do what you want would be to build a new model using only the layers that you want.

m = Sequential()
for layer in model.layers[42:69]:
  m.add(layer)

If you want to be able to reuse the outputs from the model, you might be able to use clone and replace but that would be a long discussion you should have on the tensorflow forums.

Cheers

bstriner on Jan 20, 2017

@datumbox 's solution really helps my work. Thank you very much. Actually it seems that the “terrible” solution is the only working solution found online treating this situation.

I think the important feature like model splitting should be given in the official keras API, maybe someday it will.

jaewooklee93 on Dec 18, 2018

Just to clarify my use-case (similar to issue 5083), I try to fine-tune a ResNet50 as on the InceptionV3 snippet of the Keras documentation. The only problem with the original snippet is that it’s very slow. Instead of freezing the various layers one could, split the model in two parts, run the data through the first bit & persist the intermediate outputs on disk and then train/tune quickly the second part.

Here are the steps that I am following:

Part 1: Training the network with custom classification layers:

Fetch the bottom network of ResNet50 and build a Sequential top model for my own classes.
Feed the training data to the bottom network and persist their intermediate output in a file.
Feed directly the cached intermediate results to the top network and run lots of very quick iterations.
Join the bottom and top networks into a single model.

This approach speeds up the iterations by a factor of 20, comparing to freezing the layers.

Part 2: Fine-tuning the network:

Split the above model in two parts: layers from 0 to 140 and from 141 to the end.
Feed the original training data through the first part and store the outputs in a file.
Feed the cached outputs to the second part and run lots of quick iterations to fine tune it.
Join the 2 parts into a single model.

This speeds up the iterations by a factor of 3, comparing to freezing the layers.

As I said earlier to split the model, I export the configuration, manipulate it and reconstruct part of the network. This is a very wasteful and poor solution. Do you think there is any low-level API that could help me achieve the same result?

Here is my “terrible” & non-generic solution for splitting a model:

def split_model(model, start, end):
    confs = model.get_config()
    weights = {l.name:l.get_weights() for l in model.layers}
    # split model
    kept_layers = set()
    for i, l in enumerate(confs['layers']):
        if i == 0:
            confs['layers'][0]['config']['batch_input_shape'] = model.layers[start].input_shape
        elif i < start or i > end:
            continue
        kept_layers.add(l['name'])
    # filter layers
    layers = [l for l in confs['layers'] if l['name'] in kept_layers]
    layers[1]['inbound_nodes'][0][0][0] = layers[0]['name']
    # set conf
    confs['layers'] = layers
    confs['input_layers'][0][0] = layers[0]['name']
    confs['output_layers'][0][0] = layers[-1]['name']
    # create new model
    newModel = Model.from_config(confs)
    for l in newModel.layers:
        l.set_weights(weights[l.name])
    return newModel

datumbox on Jan 21, 2017

Graph disconnected normally means your input and output are not part of the same graph. If your input was not the variable you used to create your output, this is the error you will get.

this is exactly my case. and how can i solve this issue when the input is not my variable ( my original data) ? thank you

daliboussaidi on Apr 15, 2020

Hello Everyone,

I am trying to do similar thing. I saved the intermediate output to the disk and now using this want to train and score the remaining model. Was there any addition to Keras API for this issue? @datumbox Were you able to find another solution?

kamleshkshirsagar on Jun 10, 2018

Yes I agree, you should definitely include it. If we visualize the model we will see that it makes absolutely no sense not to include the merge. 😃

My plan is to complete a less terrible solution which is more generic and allows you to split specific nodes. I’ll post it here once I have it or send a pull request.

datumbox on Feb 6, 2017

That was the first thing I tried but the Sequential API will not work with complicated models such as ResNet50 (see below the error message).

I understand now that splitting a non-Sequential model is not an easy task since a layer on the bottom can in theory take multiple inputs from higher layers. As I am not yet familiar with the low-level API of Keras, I wrote a terrible solution that extracts the json architecture, manipilates it and reconstructs the partial model. This works and actually speeds up the model-tuning of models by a factor of 3 but it is far from an elegant and generic solution. I can share the snippet if you want.

bottom = keras.applications.resnet50.ResNet50(weights='imagenet', include_top=False, input_tensor=Input(shape=(224, 224, 3), name='input'))
m = Sequential()
for l in bottom.layers[0:141]:
	m.add(l)

Traceback (most recent call last): File “<stdin>”, line 2, in <module> File “/usr/local/lib/python2.7/dist-packages/keras/models.py”, line 324, in add output_tensor = layer(self.outputs[0]) File “/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py”, line 517, in call self.add_inbound_node(inbound_layers, node_indices, tensor_indices) File “/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py”, line 571, in add_inbound_node Node.create_node(self, inbound_layers, node_indices, tensor_indices) File “/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py”, line 155, in create_node output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0])) File “/usr/local/lib/python2.7/dist-packages/keras/layers/normalization.py”, line 128, in call self.add_updates([K.moving_average_update(self.running_mean, mean, self.momentum), File “/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py”, line 364, in moving_average_update variable, value, momentum) File “/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/moving_averages.py”, line 70, in assign_moving_average update_delta = _zero_debias(variable, value, decay) File “/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/moving_averages.py”, line 177, in _zero_debias trainable=False) File “/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py”, line 1024, in get_variable custom_getter=custom_getter) File “/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py”, line 850, in get_variable custom_getter=custom_getter) File “/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py”, line 346, in get_variable validate_shape=validate_shape) File “/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py”, line 331, in _true_getter caching_device=caching_device, validate_shape=validate_shape) File “/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.py”, line 632, in _get_single_variable name, “”.join(traceback.format_list(tb)))) ValueError: Variable bn_conv1_running_mean/biased already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:

File “/usr/local/lib/python2.7/dist-packages/keras/backend/tensorflow_backend.py”, line 364, in moving_average_update variable, value, momentum) File “/usr/local/lib/python2.7/dist-packages/keras/layers/normalization.py”, line 128, in call self.add_updates([K.moving_average_update(self.running_mean, mean, self.momentum), File “/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py”, line 155, in create_node output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))

datumbox on Jan 21, 2017