tensorflow: TensorFlow restore model and retrain - ValueError: Duplicate node name in graph
I am trying to restore a trained model and retrain it with some additional operations.
I have two Python files:
- train.py - to train and save the model
- retrain.py - to load the trained model, add new elements to the graph, and retrain
train.py
import tensorflow as tf

def train():
    # X1, X2, X1label, X2label come from the pandas DataFrame passed into the
    # function; that preprocessing is omitted here.

    # 1st NN
    Xinp1 = tf.placeholder("float", [None, 2], name="Xinp1")
    Xhidden1 = tf.layers.dense(Xinp1, units=16,
                               kernel_initializer=tf.initializers.he_uniform(),
                               activation=tf.nn.relu, name="X_hidden1")
    Xout = tf.layers.dense(Xhidden1, units=1,
                           kernel_initializer=tf.initializers.he_uniform(),
                           activation=tf.nn.sigmoid, name="X_out1")
    Xout1 = tf.identity(Xout, name="Xout1")

    # 2nd NN
    Xinp2 = tf.placeholder("float", [None, 2], name="Xinp2")
    Xhidden2 = tf.layers.dense(Xinp2, units=16,
                               kernel_initializer=tf.initializers.he_uniform(),
                               activation=tf.nn.relu, name="X_hidden2")
    Xout = tf.layers.dense(Xhidden2, units=1,
                           kernel_initializer=tf.initializers.he_uniform(),
                           activation=tf.nn.sigmoid, name="X_out2")  # each dense layer needs a unique name
    Xout2 = tf.identity(Xout, name="Xout2")

    Xout1_label = tf.placeholder("float", [None, 1], name="Xout1_label")
    Xout2_label = tf.placeholder("float", [None, 1], name="Xout2_label")

    learning_rate = 1e-2

    # Define loss and optimizer
    loss_op1 = tf.losses.absolute_difference(Xout1_label, Xout1)
    loss_op2 = tf.losses.absolute_difference(Xout2_label, Xout2)

    # debug gradients
    trainables = tf.trainable_variables()
    print("trainables", trainables)

    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate, epsilon=0.1)
    train_op1 = optimizer.minimize(loss_op1)
    train_op2 = optimizer.minimize(loss_op2)

    # Summaries for the two losses (so that merged_summary_op below is defined)
    tf.summary.scalar("loss1", loss_op1)
    tf.summary.scalar("loss2", loss_op2)
    merged_summary_op = tf.summary.merge_all()

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver = tf.train.Saver()
        for _ in range(100):
            _, c1, summary = sess.run([train_op1, loss_op1, merged_summary_op], feed_dict={
                Xinp1: X1,
                Xinp2: X2,
                Xout1_label: X1label,
                Xout2_label: X2label
            })
            _, c2, summary = sess.run([train_op2, loss_op2, merged_summary_op], feed_dict={
                Xinp1: X1,
                Xinp2: X2,
                Xout1_label: X1label,
                Xout2_label: X2label
            })
        saver.save(sess, 'Model/trained.ckpt')
As output, I got the following files:
- checkpoint
- trained.ckpt.data-00000-of-00001
- trained.ckpt.index
- trained.ckpt.meta
retrain.py
def retrain():
    # learning_rate, merged_summary_op, radial_length and the training arrays
    # (NewX1, NewX2, T1inp, T2inp, T3inp, Addtarget) are defined elsewhere;
    # as in train.py, that preprocessing is omitted here.
    with tf.Session() as sess:
        saver = tf.train.import_meta_graph('Model/trained.ckpt.meta')
        saver.restore(sess, 'Model/trained.ckpt')
        graph = tf.get_default_graph()
        Xinp1 = graph.get_tensor_by_name('Xinp1:0')
        Xout1 = graph.get_tensor_by_name('Xout1:0')
        Xinp2 = graph.get_tensor_by_name('Xinp2:0')
        Xout2 = graph.get_tensor_by_name('Xout2:0')

        # I want to add some additional nodes
        T1 = tf.placeholder("float", [None, 1], name="T1")
        T2 = tf.placeholder("float", [None, 1], name="T2")
        Add1 = tf.add(tf.multiply(Xout1, tf.subtract(T1, T2)), T2, name="Add1_out")
        T3 = tf.placeholder("float", [None, 1], name="T3")
        Add2 = tf.multiply(tf.multiply(T3, tf.subtract(Add1, 300)),
                           tf.multiply(radial_length, 0.000001), name="Add2_out")

        Addlabel = tf.placeholder("float", [None, 1], name="Addlabel")
        loss_op = tf.losses.mean_squared_error(Addlabel, Add2)

        # debug gradients
        trainables = tf.trainable_variables()
        print("trainables", trainables)

        optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate, epsilon=0.1)
        train_op = optimizer.minimize(loss_op)

        sess.run(tf.global_variables_initializer())

        # training starts
        # Here I expect the weights of the 1st NN and the 2nd NN to be learned during training
        for _ in range(100):
            _, c, summary = sess.run([train_op, loss_op, merged_summary_op], feed_dict={
                Xinp1: NewX1,
                Xinp2: NewX2,
                T1: T1inp,
                T2: T2inp,
                T3: T3inp,
                Addlabel: Addtarget
            })
I am expecting retrain.py to adjust the weights associated with the 1st NN and the 2nd NN during training.
But instead, when running retrain.py, I get the following error:
Traceback (most recent call last):
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1659, in _create_c_op
c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Duplicate node name in graph: 'X_hidden1/kernel/Adam'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/itmsec/Documents/tipclearance/src/TTG_tensorflowv14.py", line 493, in <module>
restore_and_retrain(BDD)
File "/home/itmsec/Documents/tipclearance/src/TTG_tensorflowv14.py", line 244, in restore_and_retrain
train_op = optimizer.minimize(loss_op)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 413, in minimize
name=name)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 595, in apply_gradients
self._create_slots(var_list)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/adam.py", line 135, in _create_slots
self._zeros_slot(v, "m", self._name)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/optimizer.py", line 1153, in _zeros_slot
new_slot_variable = slot_creator.create_zeros_slot(var, op_name)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/slot_creator.py", line 183, in create_zeros_slot
colocate_with_primary=colocate_with_primary)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/slot_creator.py", line 157, in create_slot_with_initializer
dtype)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/slot_creator.py", line 65, in _create_slot_var
validate_shape=validate_shape)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1479, in get_variable
aggregation=aggregation)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1220, in get_variable
aggregation=aggregation)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 547, in get_variable
aggregation=aggregation)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 499, in _true_getter
aggregation=aggregation)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 911, in _get_single_variable
aggregation=aggregation)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 213, in __call__
return cls._variable_v1_call(*args, **kwargs)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 176, in _variable_v1_call
aggregation=aggregation)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 155, in <lambda>
previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 2495, in default_variable_creator
expected_shape=expected_shape, import_scope=import_scope)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 217, in __call__
return super(VariableMetaclass, cls).__call__(*args, **kwargs)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 1395, in __init__
constraint=constraint)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 1509, in _init_from_args
name=name)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/state_ops.py", line 79, in variable_op_v2
shared_name=shared_name)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_state_ops.py", line 1425, in variable_v2
shared_name=shared_name, name=name)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
op_def=op_def)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1823, in __init__
control_input_ops)
File "/home/itmsec/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1662, in _create_c_op
raise ValueError(str(e))
ValueError: Duplicate node name in graph: 'X_hidden1/kernel/Adam'
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 16 (6 by maintainers)
The reason is that AdamOptimizer creates additional variables and operations in your graph. When you save your model, those operations are stored with the graph and are loaded again when you restore the model.
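If you run a snippet like the following (a minimal sketch added here, since the exact code from the original comment was not preserved), you can see the list of operations that are loaded with your model:

import tensorflow as tf

# Restore the graph structure and print every operation name; the optimizer
# slots such as 'X_hidden1/kernel/Adam' will show up in this list.
saver = tf.train.import_meta_graph('Model/trained.ckpt.meta')
for op in tf.get_default_graph().get_operations():
    print(op.name)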
You will see operations that have /Adam or train/Adam in their names. When you try to fine-tune or reuse your model, the new AdamOptimizer tries to create those operations again, hence it raises the "Duplicate node name" error. One way to fix the issue is to give a name to your new AdamOptimizer.
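For example (the name "NewAdam" is just an illustrative choice, not something from the original code; any name that does not clash with the restored optimizer works):

optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate,
                                   epsilon=0.1,
                                   name="NewAdam")
train_op = optimizer.minimize(loss_op)

With this, the new slot variables are created under names like X_hidden1/kernel/NewAdam instead of colliding with the restored X_hidden1/kernel/Adam.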
However, we are not done yet. Since you want to reuse the trained weights, you cannot simply initialize all variables again. But then you will get an error about uninitialized variables when you run your training, raised because the new AdamOptimizer's variables have not been initialized yet. To get around it, you need to initialize only those new variables:
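One common pattern is sketched below (the original comment's exact snippet was not preserved; as the reply further down notes, newer TF 1.x versions use tf.variables_initializer instead of tf.initialize_variables):

# Collect the variables that restore() did not cover, i.e. the ones that are
# still uninitialized (typically the new optimizer's slot variables).
uninitialized_vars = []
for var in tf.global_variables():
    try:
        sess.run(var)
    except tf.errors.FailedPreconditionError:
        uninitialized_vars.append(var)

# Initialize only those, leaving the restored weights untouched.
sess.run(tf.initialize_variables(uninitialized_vars))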
Note: Unused nodes will not be executed and hence they won’t affect training time.
Thanks for your reply. Actually I have figured out this problem: I use TF 1.8, where tf.initialize_variables is no longer available, so I use sess.run(tf.variables_initializer(uninitialized_vars)) instead, and it works well. I have not yet tried any other optimizers, but I believe your solution will work for them too.
OS: Ubuntu 18
TensorFlow: CPU version, installed via pip
The code snippets provided above are complete; I have only omitted the pandas DataFrame that is passed into the function.