tensorflow: Restore model and retrain - ValueError: Duplicate node name in graph

I am trying to restore the trained model and retrain it with some additional operations.

I have 2 Python files, let's say:

train.py - to train and save the model
retrain.py - to load the trained model, add new nodes to the graph, and retrain

train.py

import tensorflow as tf

def train():
    # 1st NN
    Xinp1 = tf.placeholder("float", [None, 2], name="Xinp1")
    Xhidden1 = tf.layers.dense(Xinp1, units=16,
                kernel_initializer=tf.initializers.he_uniform(),
                activation=tf.nn.relu, name="X_hidden1")
    Xout = tf.layers.dense(Xhidden1, units=1,
                kernel_initializer=tf.initializers.he_uniform(),
                activation=tf.nn.sigmoid, name="X_out")
    Xout1 = tf.identity(Xout, name="Xout1")

    # 2nd NN
    Xinp2 = tf.placeholder("float", [None, 2], name="Xinp2")
    Xhidden2 = tf.layers.dense(Xinp2, units=16,
                kernel_initializer=tf.initializers.he_uniform(),
                activation=tf.nn.relu, name="X_hidden2")
    Xout = tf.layers.dense(Xhidden2, units=1,
                kernel_initializer=tf.initializers.he_uniform(),
                activation=tf.nn.sigmoid, name="X_out2")
    Xout2 = tf.identity(Xout, name="Xout2")

    Xout1_label = tf.placeholder("float", [None, 1], name="Xout1_label")
    Xout2_label = tf.placeholder("float", [None, 1], name="Xout2_label")

    learning_rate = 1e-2
    # Define loss and optimizer
    loss_op1 = tf.losses.absolute_difference(Xout1_label, Xout1)
    loss_op2 = tf.losses.absolute_difference(Xout2_label, Xout2)

    # debug gradients
    trainables = tf.trainable_variables()
    print("trainables", trainables)
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate, epsilon=0.1)

    train_op1 = optimizer.minimize(loss_op1)
    train_op2 = optimizer.minimize(loss_op2)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver = tf.train.Saver()
        # X1, X2, X1label, X2label come from the pandas dataframe omitted here
        for _ in range(100):
            _, c1 = sess.run([train_op1, loss_op1], feed_dict={
                Xinp1: X1,
                Xinp2: X2,
                Xout1_label: X1label,
                Xout2_label: X2label
            })
            _, c2 = sess.run([train_op2, loss_op2], feed_dict={
                Xinp1: X1,
                Xinp2: X2,
                Xout1_label: X1label,
                Xout2_label: X2label
            })
        saver.save(sess, 'Model/trained.ckpt')

As output, I got the following files:

  1. checkpoint
  2. trained.ckpt.data-00000-of-00001
  3. trained.ckpt.index
  4. trained.ckpt.meta

retrain.py

def retrain():
    with tf.Session() as sess:
        saver = tf.train.import_meta_graph('Model/trained.ckpt.meta')
        saver.restore(sess, 'Model/trained.ckpt')
        graph = tf.get_default_graph()
        Xinp1 = graph.get_tensor_by_name('Xinp1:0')
        Xout1 = graph.get_tensor_by_name('Xout1:0')
        Xinp2 = graph.get_tensor_by_name('Xinp2:0')
        Xout2 = graph.get_tensor_by_name('Xout2:0')

        # I want to add some additional nodes
        T1 = tf.placeholder("float", [None, 1], name="T1")
        T2 = tf.placeholder("float", [None, 1], name="T2")
        Add1 = tf.add(tf.multiply(Xout1, tf.subtract(T1, T2)), T2, name="Add1_out")

        T3 = tf.placeholder("float", [None, 1], name="T3")
        # radial_length is computed from the dataframe omitted here
        Add2 = tf.multiply(tf.multiply(T3, tf.subtract(Add1, 300.0)),
                           tf.multiply(radial_length, 0.000001), name="Add2_out")

        Addlabel = tf.placeholder("float", [None, 1], name="Addlabel")

        loss_op = tf.losses.mean_squared_error(Addlabel, Add2)

        # debug gradients
        trainables = tf.trainable_variables()
        print("trainables", trainables)
        learning_rate = 1e-2
        optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate, epsilon=0.1)
        train_op = optimizer.minimize(loss_op)   # <-- this line raises the error

        sess.run(tf.global_variables_initializer())
        # training starts
        # Here I expect the weights of the 1st NN and 2nd NN to be learned during training
        for _ in range(100):
            _, c = sess.run([train_op, loss_op], feed_dict={
                Xinp1: NewX1,
                Xinp2: NewX2,
                T1: T1inp,
                T2: T2inp,
                T3: T3inp,
                Addlabel: Addtarget
            })

I expect retrain.py to adjust the weights of the 1st NN and 2nd NN during training.

Instead, running retrain.py produces the following error:

Traceback (most recent call last):
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1659, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Duplicate node name in graph: 'X_hidden1/kernel/Adam'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/itmsec/Documents/tipclearance/src/TTG_tensorflowv14.py", line 493, in <module>
    restore_and_retrain(BDD)
  File "/home/itmsec/Documents/tipclearance/src/TTG_tensorflowv14.py", line 244, in restore_and_retrain
    train_op = optimizer.minimize(loss_op)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 413, in minimize
    name=name)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 595, in apply_gradients
    self._create_slots(var_list)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/adam.py", line 135, in _create_slots
    self._zeros_slot(v, "m", self._name)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 1153, in _zeros_slot
    new_slot_variable = slot_creator.create_zeros_slot(var, op_name)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/slot_creator.py", line 183, in create_zeros_slot
    colocate_with_primary=colocate_with_primary)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/slot_creator.py", line 157, in create_slot_with_initializer
    dtype)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/slot_creator.py", line 65, in _create_slot_var
    validate_shape=validate_shape)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1479, in get_variable
    aggregation=aggregation)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1220, in get_variable
    aggregation=aggregation)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 547, in get_variable
    aggregation=aggregation)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 499, in _true_getter
    aggregation=aggregation)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 911, in _get_single_variable
    aggregation=aggregation)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 213, in __call__
    return cls._variable_v1_call(*args, **kwargs)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 176, in _variable_v1_call
    aggregation=aggregation)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 155, in <lambda>
    previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 2495, in default_variable_creator
    expected_shape=expected_shape, import_scope=import_scope)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 217, in __call__
    return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 1395, in __init__
    constraint=constraint)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 1509, in _init_from_args
    name=name)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 79, in variable_op_v2
    shared_name=shared_name)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 1425, in variable_v2
    shared_name=shared_name, name=name)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1823, in __init__
    control_input_ops)
  File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1662, in _create_c_op
    raise ValueError(str(e))
ValueError: Duplicate node name in graph: 'X_hidden1/kernel/Adam'

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 16 (6 by maintainers)

Most upvoted comments

The reason is that AdamOptimizer creates additional variables and operations in your graph. When you save your model, those operations are stored with the graph, and they are loaded again when you restore the model. If you run

graph.get_operations()

you can see the list of operations that were loaded with your model; among them you will see operations with /Adam or train/Adam in their names. When you try to fine-tune or reuse your model, the new AdamOptimizer tries to create those operations again, hence it raises the "Duplicate node name" error. One way to fix the issue is to give a name to your new AdamOptimizer:

opt = tf.train.AdamOptimizer(2e-4, name='MyNewAdam').minimize(Loss)
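Why naming helps: Adam's slot variables are named after the variable they belong to plus the optimizer's name, so a second optimizer with the default name "Adam" collides with the restored slots. A toy sketch of the graph's unique-name check (plain Python, purely illustrative — not TensorFlow internals):

```python
class ToyGraph:
    """Minimal stand-in for a TF graph's node-name registry."""
    def __init__(self):
        self.nodes = set()

    def add_node(self, name):
        # A real graph rejects a second node with an already-registered name
        if name in self.nodes:
            raise ValueError("Duplicate node name in graph: '%s'" % name)
        self.nodes.add(name)

g = ToyGraph()
# import_meta_graph restores the old optimizer's slot nodes:
g.add_node("X_hidden1/kernel/Adam")
# A new, default-named AdamOptimizer tries to register the same name:
try:
    g.add_node("X_hidden1/kernel/Adam")
except ValueError as e:
    print(e)  # Duplicate node name in graph: 'X_hidden1/kernel/Adam'
# Naming the new optimizer gives its slots a fresh, non-clashing name:
g.add_node("X_hidden1/kernel/MyNewAdam")
```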

However, we are not done yet. Since you want to reuse the restored weights, you cannot just run a global initializer; but if you initialize nothing, you will get an uninitialized-variable error when you run your training, raised by the new AdamOptimizer's variables, which have not been initialized yet. To get around it, initialize only those new variables:

uninitialized_vars = []
for var in tf.all_variables():
    try:
        sess.run(var)
    except tf.errors.FailedPreconditionError:
        uninitialized_vars.append(var)

sess.run(tf.initialize_variables(uninitialized_vars))

Note: Unused nodes will not be executed and hence they won’t affect training time.
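Stripped of TensorFlow, the loop above is just "probe every variable, keep the ones whose probe fails". A plain-Python sketch of that partitioning (the restored set stands in for session state; all names here are illustrative):

```python
def find_uninitialized(all_vars, restored):
    """Mirror the try/except loop: collect variables whose read would fail."""
    return [v for v in all_vars if v not in restored]

# Weights restored from the checkpoint vs. the new optimizer's fresh slots:
restored = {"X_hidden1/kernel", "X_hidden1/bias"}
all_vars = ["X_hidden1/kernel", "X_hidden1/bias",
            "X_hidden1/kernel/MyNewAdam", "X_hidden1/bias/MyNewAdam"]
print(find_uninitialized(all_vars, restored))
# ['X_hidden1/kernel/MyNewAdam', 'X_hidden1/bias/MyNewAdam']
```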

Thanks for your reply. Actually I have figured out this problem: I use TF 1.8, where tf.initialize_variables has been deprecated. I use sess.run(tf.variables_initializer(uninitialized_vars)) instead and it works well. I have not yet tried any other optimizers, but I believe your solution will work for them too.

OS: Ubuntu 18
TensorFlow: CPU version, installed via pip

The code snippet provided above is complete; I have only omitted the pandas dataframe that is passed into the function.