tensorflow: Keras application - Tensor is not an element of this graph on eval after train

System information

Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS 10.13.1
TensorFlow installed from (source or binary): pip
TensorFlow version (use command below): v1.4.0-rc1-11-g130a514 1.4.0
Python version: 3.6.3
CUDA/cuDNN version: N/A CPU only
Exact command to reproduce:

Describe the problem

Using the Estimator API with tf.keras.applications.VGG16 and its output for transfer learning, I get the exception TypeError: Cannot interpret feed_dict key as Tensor: Tensor Tensor("vgg_base/Placeholder:0", shape=(3, 3, 3, 64), dtype=float32) is not an element of this graph. when the model is run a second time.

This is raised when the eval step runs after train in tf.estimator.train_and_evaluate. See the source code for the model and the estimator output below. It also occurs if I run train_and_evaluate a second time. I am running in a Jupyter notebook, and my assumption is that this is related to state held in memory: after Kernel ➝ Restart a training run works again without the error, but two runs cannot be executed in a row without a restart.

See https://github.com/damienpontifex/fastai-course/blob/master/deeplearning1/lesson1%2B3/DogsVsCats.ipynb for full notebook, but main parts for estimator model and output are below:

Source code / logs

Estimator Model

def vgg16_model_fn(features, mode, params):
    
    is_training = mode == tf.estimator.ModeKeys.TRAIN
    
    with tf.variable_scope('vgg_base'):
        # Use a pre-trained VGG16 model and drop off the top layers as we will retrain 
        # with our own dense output for our custom classes
        vgg16_base = tf.keras.applications.VGG16(
            include_top=False,
            input_shape=(224, 224, 3),
            input_tensor=features['image'],
            pooling='avg')

        # Disable training for all layers to increase speed for transfer learning.
        # If the new classes differ significantly from ImageNet, it may be worth leaving trainable = True.
        for layer in vgg16_base.layers:
            layer.trainable = False

        x = vgg16_base.output
    
    with tf.variable_scope("fc"):
        x = tf.layers.flatten(x)
        x = tf.layers.dense(x, units=4096, activation=tf.nn.relu, trainable=is_training, name='fc1')
        x = tf.layers.dense(x, units=4096, activation=tf.nn.relu, trainable=is_training, name='fc2')
        x = tf.layers.dropout(x, rate=0.5, training=is_training)
        
    # Finally, add a dense layer with NUM_CLASSES units for class predictions
    with tf.variable_scope("Prediction"):
        x = tf.layers.dense(x, units=NUM_CLASSES, trainable=is_training)
        return x

Estimator setup

dog_cat_estimator = tf.estimator.Estimator(
    model_fn=model_fn,
    config=run_config,
    params=params
)
train_spec = tf.estimator.TrainSpec(
    input_fn=data_input_fn(train_record_filenames, num_epochs=None, batch_size=10, shuffle=True), 
    max_steps=10)
eval_spec = tf.estimator.EvalSpec(
    input_fn=data_input_fn(validation_record_filenames)
)
tf.estimator.train_and_evaluate(dog_cat_estimator, train_spec, eval_spec)

train_and_evaluate output

INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after 600 secs (eval_spec.throttle_secs) or training is finished.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Restoring parameters from /tmp/DogsVsCats/model.ckpt-1
INFO:tensorflow:Saving checkpoints for 2 into /tmp/DogsVsCats/model.ckpt.
INFO:tensorflow:loss = 0.0, step = 2
INFO:tensorflow:Saving checkpoints for 10 into /tmp/DogsVsCats/model.ckpt.
INFO:tensorflow:Loss for final step: 0.0.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1063             subfeed_t = self.graph.as_graph_element(subfeed, allow_tensor=True,
-> 1064                                                     allow_operation=False)
   1065           except Exception as e:

/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py in as_graph_element(self, obj, allow_tensor, allow_operation)
   3034     with self._lock:
-> 3035       return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
   3036 

/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py in _as_graph_element_locked(self, obj, allow_tensor, allow_operation)
   3113       if obj.graph is not self:
-> 3114         raise ValueError("Tensor %s is not an element of this graph." % obj)
   3115       return obj

ValueError: Tensor Tensor("vgg_base/Placeholder:0", shape=(3, 3, 3, 64), dtype=float32) is not an element of this graph.

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-12-67c818ea66c5> in <module>()
----> 1 tf.estimator.train_and_evaluate(dog_cat_estimator, train_spec, eval_spec)

/usr/local/lib/python3.6/site-packages/tensorflow/python/estimator/training.py in train_and_evaluate(estimator, train_spec, eval_spec)
    428       config.task_type != run_config_lib.TaskType.EVALUATOR):
    429     logging.info('Running training and evaluation locally (non-distributed).')
--> 430     executor.run_local()
    431     return
    432 

/usr/local/lib/python3.6/site-packages/tensorflow/python/estimator/training.py in run_local(self)
    614       # condition is satisfied (both checks use the same global_step value,
    615       # i.e., no race condition)
--> 616       metrics = evaluator.evaluate_and_export()
    617 
    618       if not metrics:

/usr/local/lib/python3.6/site-packages/tensorflow/python/estimator/training.py in evaluate_and_export(self)
    749           name=self._eval_spec.name,
    750           checkpoint_path=latest_ckpt_path,
--> 751           hooks=self._eval_spec.hooks)
    752 
    753       if not eval_result:

/usr/local/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py in evaluate(self, input_fn, steps, hooks, checkpoint_path, name)
    353         hooks=hooks,
    354         checkpoint_path=checkpoint_path,
--> 355         name=name)
    356 
    357   def _convert_eval_steps_to_hooks(self, steps):

/usr/local/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py in _evaluate_model(self, input_fn, hooks, checkpoint_path, name)
    808           input_fn, model_fn_lib.ModeKeys.EVAL)
    809       estimator_spec = self._call_model_fn(
--> 810           features, labels, model_fn_lib.ModeKeys.EVAL, self.config)
    811 
    812       if model_fn_lib.LOSS_METRIC_KEY in estimator_spec.eval_metric_ops:

/usr/local/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py in _call_model_fn(self, features, labels, mode, config)
    692     if 'config' in model_fn_args:
    693       kwargs['config'] = config
--> 694     model_fn_results = self._model_fn(features=features, **kwargs)
    695 
    696     if not isinstance(model_fn_results, model_fn_lib.EstimatorSpec):

<ipython-input-8-e251e8b8fccf> in model_fn(features, labels, mode, params)
      3     tf.summary.image('images', features['image'], max_outputs=6)
      4 
----> 5     logits = vgg16_model_fn(features, mode, params)
      6 
      7     # Dictionary with label as outcome with greatest probability

<ipython-input-7-93330b8a5aa6> in vgg16_model_fn(features, mode, params)
     10             input_shape=(224, 224, 3),
     11             input_tensor=features['image'],
---> 12             pooling='avg')
     13 
     14         # Disable training for all layers to increase speed for transfer learning

/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/_impl/keras/applications/vgg16.py in VGG16(include_top, weights, input_tensor, input_shape, pooling, classes)
    199           WEIGHTS_PATH_NO_TOP,
    200           cache_subdir='models')
--> 201     model.load_weights(weights_path)
    202     if K.backend() == 'theano':
    203       layer_utils.convert_all_kernels_in_model(model)

/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/_impl/keras/engine/topology.py in load_weights(self, filepath, by_name)
   1097       load_weights_from_hdf5_group_by_name(f, self.layers)
   1098     else:
-> 1099       load_weights_from_hdf5_group(f, self.layers)
   1100 
   1101     if hasattr(f, 'close'):

/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/_impl/keras/engine/topology.py in load_weights_from_hdf5_group(f, layers)
   1484                        str(len(weight_values)) + ' elements.')
   1485     weight_value_tuples += zip(symbolic_weights, weight_values)
-> 1486   K.batch_set_value(weight_value_tuples)
   1487 
   1488 

/usr/local/lib/python3.6/site-packages/tensorflow/python/keras/_impl/keras/backend.py in batch_set_value(tuples)
   2404       assign_ops.append(assign_op)
   2405       feed_dict[assign_placeholder] = value
-> 2406     get_session().run(assign_ops, feed_dict=feed_dict)
   2407 
   2408 

/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
    887     try:
    888       result = self._run(None, fetches, feed_dict, options_ptr,
--> 889                          run_metadata_ptr)
    890       if run_metadata:
    891         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1065           except Exception as e:
   1066             raise TypeError('Cannot interpret feed_dict key as Tensor: '
-> 1067                             + e.args[0])
   1068 
   1069           if isinstance(subfeed_val, ops.Tensor):

TypeError: Cannot interpret feed_dict key as Tensor: Tensor Tensor("vgg_base/Placeholder:0", shape=(3, 3, 3, 64), dtype=float32) is not an element of this graph.
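
The pair of exceptions above comes from a graph-membership check: the session resolves each feed_dict key against its own graph (the `obj.graph is not self` test visible in the ops.py frame) and re-raises the resulting ValueError as a TypeError. The following is a minimal pure-Python sketch of that check, not TensorFlow code. `ToyTensor`, `ToyGraph` and `ToySession` are hypothetical stand-ins that only mirror the lines shown in the traceback.

```python
class ToyTensor:
    """Stand-in for a tensor: it remembers the graph it was created in."""

    def __init__(self, name, graph):
        self.name = name
        self.graph = graph

    def __repr__(self):
        return 'Tensor("%s")' % self.name


class ToyGraph:
    """Stand-in for a graph with the membership check from the traceback."""

    def placeholder(self, name):
        return ToyTensor(name, self)

    def as_graph_element(self, obj):
        # Same test as the ops.py frame: identity of the owning graph.
        if obj.graph is not self:
            raise ValueError("Tensor %s is not an element of this graph." % obj)
        return obj


class ToySession:
    """Stand-in for a session, which is bound to exactly one graph."""

    def __init__(self, graph):
        self.graph = graph

    def run(self, feed_dict):
        for key in feed_dict:
            try:
                self.graph.as_graph_element(key)
            except Exception as e:
                # Same wrapping as the session.py frame.
                raise TypeError(
                    "Cannot interpret feed_dict key as Tensor: " + e.args[0])
        return "ok"


first_graph = ToyGraph()
session = ToySession(first_graph)  # session bound to the first graph
second_graph = ToyGraph()
ph = second_graph.placeholder("vgg_base/Placeholder:0")  # built in a newer graph

try:
    session.run({ph: 1.0})
    error_message = None
except TypeError as e:
    error_message = str(e)
print(error_message)
```

Feeding the same placeholder to a session created for `second_graph` passes the check, which is why the question in this thread is which graph and which session are current at the time of the call.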

About this issue

State: closed
Created 7 years ago
Comments: 39 (3 by maintainers)

Links to this issue

Commits related to this issue

Make NN to work in Multithreading: https://github.com/tensorflow/tensorflow/issues/14356 — committed to MbProg/BughouseAlphaZero by MbProg 5 years ago

Most upvoted comments

K.clear_session() did not work for me.

However, what worked was:

def load_model():
    global model
    model = ResNet50(weights="imagenet")
    # This is key: save the graph after loading the model
    global graph
    graph = tf.get_default_graph()

While predicting, use the same graph:

with graph.as_default():
    preds = model.predict(image)
    # ... etc

+258

anujgupta82 on Jun 1, 2018
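
The same idea can be packaged without module-level globals. The sketch below is one possible arrangement, not the commenter's code: `GraphBoundModel`, `loader` and `get_graph` are hypothetical names, and the class only assumes that the graph object offers `as_default()` as a context manager and that the model offers `predict()`, as in the snippet above.

```python
class GraphBoundModel:
    """Load a model once and always predict inside the graph it was loaded in."""

    def __init__(self, loader, get_graph):
        # loader: zero-argument callable that builds or loads the model.
        # get_graph: zero-argument callable returning the current default graph.
        self.model = loader()
        # Capture the graph right after loading, as in the comment above.
        self.graph = get_graph()

    def predict(self, inputs):
        # Re-enter the captured graph in whatever thread the call arrives on.
        with self.graph.as_default():
            return self.model.predict(inputs)


# Hypothetical TF 1.x usage (assumption, not executed here):
#   bound = GraphBoundModel(lambda: ResNet50(weights="imagenet"),
#                           tf.get_default_graph)
#   preds = bound.predict(image)
```

Passing the loader and the graph getter as callables keeps the class independent of any particular Keras or TensorFlow import, so it can also be exercised with stand-in objects.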

This question is better asked on StackOverflow since it is not a bug or feature request. There is also a larger community that reads questions there. Thanks!

+207

bignamehyp on Nov 8, 2017

This worked for me: I added from keras import backend as K, and after predicting on my data I called K.clear_session().

+57

mohammedyunus009 on May 16, 2018

The reason the code from @anujgupta82 works is given in this Stack Overflow answer:

Flask uses multiple threads. The problem you are running into is because the tensorflow model is not loaded and used in the same thread. One workaround is to force tensorflow to use the global default graph.

+29

atakanarikan on Dec 15, 2018
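
The TensorFlow documentation for `tf.Graph.as_default` states that the default graph is a property of the current thread, and that a new thread must enter `with g.as_default():` itself. The pure-Python sketch below models that rule with `threading.local`. It is a toy illustration with hypothetical names (`get_default_graph`, `as_default`, the string "graphs"), not TensorFlow's implementation, and the global fallback is a simplification.

```python
import threading
from contextlib import contextmanager

_GLOBAL_DEFAULT = "global-default-graph"  # fallback when a thread set nothing
_local = threading.local()                # one default-graph stack per thread


def get_default_graph():
    """Return the innermost graph entered by the current thread, if any."""
    stack = getattr(_local, "stack", [])
    return stack[-1] if stack else _GLOBAL_DEFAULT


@contextmanager
def as_default(graph):
    """Make `graph` the default for the current thread only."""
    if not hasattr(_local, "stack"):
        _local.stack = []
    _local.stack.append(graph)
    try:
        yield graph
    finally:
        _local.stack.pop()


seen = {}


def worker_without_context():
    seen["plain"] = get_default_graph()


def worker_with_context(graph):
    with as_default(graph):
        seen["explicit"] = get_default_graph()


with as_default("model-graph") as model_graph:
    # The main thread sees the graph it entered...
    seen["main"] = get_default_graph()
    # ...but a new thread starts with an empty stack of its own.
    t1 = threading.Thread(target=worker_without_context)
    # Passing the saved graph and re-entering it restores the match.
    t2 = threading.Thread(target=worker_with_context, args=(model_graph,))
    for t in (t1, t2):
        t.start()
    for t in (t1, t2):
        t.join()

print(seen)
```

In this model, the worker that re-enters the saved graph sees the same graph as the thread that loaded the model, which is the effect the `with graph.as_default():` workaround relies on.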

@bignamehyp Someone already asked a similar question on Stack Overflow.

The solution is to call tf.keras.backend.clear_session() after the call to train(). However, this won’t work if the user wants to use train_and_evaluate() since there is no place to call clear_session().

+20

hsm207 on Nov 9, 2017

@bignamehyp I had assumed this was a bug, as it seems to occur with variables set up inside ‘tf.keras.applications.VGG16’ rather than any I had set up myself. Thoughts?

+17

damienpontifex on Nov 8, 2017

Same problem here when trying to run inference with a pre-trained Keras model from a Flask application. Thanks, @anujgupta82!

+12

vitojph on Jun 8, 2018

This works for me. @shaoeChen, how is this working for you? It turns out this approach does not need a clear_session call and is configuration-friendly at the same time.

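# load_keras_model.py
import os

import numpy as np
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
from keras.models import load_model

# Not defined in the original comment; assumed here so the snippet is complete.
file_path = 'model.h5'  # hypothetical path to a saved Keras model


class NonResourceException(Exception):
    """Raised when TENSORFLOW_VERSION is neither 'GPU' nor 'CPU' (assumed definition)."""


class LoadKerasModel: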
    model = None
    graph = None

    def __init__(self):
        config = self.keras_resource()
        self.init_model(config)

    def init_model(self, _config, *args):
        session = tf.Session(config=_config)
        self.graph = session.graph
        set_session(session)
        self.model = load_model(file_path)

    def keras_resource(self):
        num_cores = 4

        if os.getenv('TENSORFLOW_VERSION') == 'GPU':
            num_gpu = 1
            num_cpu = 1
        elif os.getenv('TENSORFLOW_VERSION') == 'CPU':
            num_gpu = 0
            num_cpu = 1
        else:
            raise NonResourceException()

        config = tf.ConfigProto(intra_op_parallelism_threads=num_cores,
                                inter_op_parallelism_threads=num_cores, allow_soft_placement=True,
                                device_count={'CPU': num_cpu, 'GPU': num_gpu})
        config.gpu_options.allow_growth = True
        
        return config

    def predict_target(self, img_generator):
        with self.graph.as_default():
            predict = self.model.predict_generator(
                img_generator,
                steps=len(img_generator),
                verbose=1
            )
        return predict

load_keras_model = LoadKerasModel()
load_keras_model.predict_target(np.ones((1, 1, 1, 1))) #img_generator

ArashHosseini on Feb 25, 2019

@bignamehyp does this information from @hsm207 provide any further insights? If I have to call clear_session() between runs, that seems like unexpected behaviour and therefore a bug.

I'm still not sure why it’s happening, so I can't offer any insight into a potential solution.

damienpontifex on Nov 10, 2017
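
One plausible explanation, consistent with the traceback (the failing call is `get_session().run(...)` in the Keras backend) and with the report that `clear_session()` after `train()` helps: the backend keeps a module-level session that stays bound to the graph of the first run, while each train or evaluate call builds the model in a fresh graph. This is an assumption, not a confirmed diagnosis. The pure-Python sketch below models only that caching behaviour. Every name in it is hypothetical and none of it is TensorFlow code.

```python
class ToyGraph:
    """Stand-in for a graph; only its identity matters here."""


class ToySession:
    """Stand-in for a session bound to one graph for its whole life."""

    def __init__(self, graph):
        self.graph = graph


_cached_session = None  # hypothetical module-level cache, like a backend global


def get_session(current_graph):
    """Return the cached session, creating it on first use only."""
    global _cached_session
    if _cached_session is None:
        _cached_session = ToySession(current_graph)
    return _cached_session


def clear_session():
    """Drop the cache so the next get_session() binds to the current graph."""
    global _cached_session
    _cached_session = None


def build_model(current_graph):
    """Model one model_fn call: weights are loaded through the cached session."""
    session = get_session(current_graph)
    # Loading can only succeed if the session belongs to the graph being built.
    return session.graph is current_graph


train_ok = build_model(ToyGraph())             # first call: cache is created
eval_ok = build_model(ToyGraph())              # fresh graph, stale session
clear_session()
eval_after_clear_ok = build_model(ToyGraph())  # cache rebuilt for the new graph
print(train_ok, eval_ok, eval_after_clear_ok)
```

Under this model the second build fails for the same reason a kernel restart helps: the only thing that changes between runs is the cached state, not the model code.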

If you hit this problem, try calling K.clear_session() before the second time you call the function that builds your graph. You should also reload the model afterwards and run one prediction on a simple dummy input. I fixed my code like this:

uncerts_normal = get_mc_predictions(model, X_test, Y_label,
                                    batch_size=args.batch_size).var(axis=0)  # .mean(axis=1)
print(uncerts_normal.shape)
uncerts_normal1 = l2_normalize(a, axis=-1)
# Reset the Keras/TensorFlow state, then reload the model
K.clear_session()
model = load_model('../data/model_%s.h5' % args.dataset)
# One dummy prediction right after loading
print('testing model1:', model.predict(np.zeros((1, 28, 28, 1))))
uncerts_noisy = get_mc_predictions(model, X_test_noisy, Y_label,
                                   batch_size=args.batch_size).var(axis=0)

Qmoliang on Apr 23, 2018

K.clear_session() did not work for me.

However, what worked was:

def load_model():
    global model
    model = ResNet50(weights="imagenet")
    # This is key: save the graph after loading the model
    global graph
    graph = tf.get_default_graph()

While predicting, use the same graph:

with graph.as_default():
    preds = model.predict(image)
    # ... etc

Thanks. I struggled with the same problem for half a day and solved it with your suggestion.

ironmanciti on Dec 16, 2018

I have encountered this error in some code I was working with, and none of the above answers worked for me.

The problem turned out to be that the code mixed keras and tensorflow.keras, and using keras.backend.clear_session() instead of tensorflow.keras.backend.clear_session() broke everything after the network was trained for the first time.

Thanks, I had the same problem, and following your answer fixed it.

Jason7602 on Jul 10, 2019

I have encountered this error in some code I was working with, and none of the above answers worked for me.

The problem turned out to be that the code mixed keras and tensorflow.keras, and using keras.backend.clear_session() instead of tensorflow.keras.backend.clear_session() broke everything after the network was trained for the first time.

joaofbsm on May 31, 2019
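
One way to spot this kind of mixing in a code base is to scan the import statements. The sketch below uses only the standard `ast` module. `find_keras_imports` is a hypothetical helper, and it only looks at `import` and `from ... import` statements, so attribute access such as `tf.keras.backend` is not detected.

```python
import ast


def find_keras_imports(source):
    """Return (standalone, bundled): names imported from keras vs tensorflow.keras."""
    standalone, bundled = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Join module and imported name so `from tensorflow import keras`
            # is seen as "tensorflow.keras".
            names = [node.module + "." + alias.name for alias in node.names]
        else:
            continue
        for name in names:
            if name == "keras" or name.startswith("keras."):
                standalone.add(name)
            elif name == "tensorflow.keras" or name.startswith("tensorflow.keras."):
                bundled.add(name)
    return standalone, bundled


example = """
from keras import backend as K
from tensorflow.keras.models import load_model
import tensorflow as tf
"""
standalone, bundled = find_keras_imports(example)
mixed = bool(standalone) and bool(bundled)
print(sorted(standalone), sorted(bundled), mixed)
```

A file for which both sets are non-empty is a candidate for the mismatch described above, where clear_session() from one package is applied to a model built with the other.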

K.clear_session() did not work for me.

However, what worked was:

def load_model():
    global model
    model = ResNet50(weights="imagenet")
    # This is key: save the graph after loading the model
    global graph
    graph = tf.get_default_graph()

While predicting, use the same graph:

with graph.as_default():
    preds = model.predict(image)
    # ... etc

god among men. Worked.

conradbm on Dec 10, 2018

clear_session()

In my case, load_model() works the first time but not afterwards. If you are experiencing the same issue, you need to call clear_session() each time you load the model!

alanhyue on Oct 1, 2018

The solution from @anujgupta82 worked for me too. But can someone help me understand what is going on?

nikhilkuria on Jun 19, 2018