tensorflow: tf.get_variable() cannot recognize existing variables
with tf.variable_scope("conv", reuse=True): x = tf.get_variable("w",[1])
The code above fails to find the existing variable, even though that variable was clearly created earlier – printing the variable's .name shows: conv/w:0
I get an error when using tf.get_variable:
ValueError: Under-sharing: Variable conv/w does not exist, disallowed. Did you mean to set reuse=None in VarScope?
If set to reuse=None…
with tf.variable_scope("conv", reuse=None): x = tf.get_variable("w",[1])
Then it creates another variable with .name conv/w_1:0
It’s a bug! Now I have 2 variables, with names 'conv/w' and 'conv/w_1'.
About this issue
- Original URL
- State: closed
- Created 8 years ago
- Reactions: 22
- Comments: 66 (22 by maintainers)
Commits related to this issue
- Add save, restore, inc, logging, print_vars method - save, restore are not working. https://github.com/tensorflow/tensorflow/issues/1325 — committed to dongkwan-kim/DEN by dongkwan-kim 6 years ago
@ilblackdragon Friendly ping, please update this bug with the current status. Thanks!
@VerityStone Yes, I am using the same. I load the variables into a dictionary where each variable is referenced by its name. For inference, I use the same code which contains tf.nn.bidirectional_dynamic_rnn(), which means it will create the same set of variables but with an “_1” suffix in the name. It will also initialize each variable randomly.
I then overwrite each such randomly initialized “_1” variable with the value of the corresponding variable in the dictionary, i.e., the one without the “_1”. To do that, I loop with “for v in tf.trainable_variables()” and call sess.run(v.assign(...)).
Just keep in mind that this creates two superfluous copies of the parameters in memory: the one that is loaded and the one in the dictionary. So, if you are running a multi-layer net with large node sizes and input and/or output projections, you are looking at many GBs and may run out of memory.
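A minimal sketch of that workaround, assuming the dictionary is built with tf.train.NewCheckpointReader (the comment does not say how it was built) and that the checkpoint path and the “_1” name mapping are only illustrative:

    import tensorflow as tf

    ckpt_path = "/tmp/model.ckpt"  # hypothetical path

    # 1) Read the checkpointed values into a plain dict keyed by variable name.
    reader = tf.train.NewCheckpointReader(ckpt_path)
    restored = {name: reader.get_tensor(name)
                for name in reader.get_variable_to_shape_map()}

    # 2) Rebuild the graph; tf.nn.bidirectional_dynamic_rnn() creates fresh,
    #    randomly initialized variables whose names carry an "_1"-style suffix.
    #    ... build the model here ...

    # 3) Overwrite each fresh variable with the checkpointed value of the
    #    corresponding variable (the one without the suffix).
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for v in tf.trainable_variables():
            base_name = v.op.name.replace("_1", "")  # crude mapping, illustration only
            if base_name in restored:
                sess.run(v.assign(restored[base_name]))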
Every TensorFlow operation (tf.Variable, tf.add, even t1 * t2 when t1 and t2 are tensors) creates, in the background, an op that is stored in the graph. So if you write y = t1 * t2, you’re creating a Mul op in the global TF graph. It all happens behind the scenes; you can look at python/framework/ops.py if you want to know more. The graph is created (and can be changed) inside a with tf.Graph().as_default(): context.
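For instance (a tiny illustration; the constants are arbitrary):

    import tensorflow as tf

    g = tf.Graph()
    with g.as_default():
        t1 = tf.constant(2.0)
        t2 = tf.constant(3.0)
        y = t1 * t2  # silently registers a Mul op in the graph

    # The multiplication now exists as a node of the graph.
    print([op.type for op in g.get_operations()])  # ['Const', 'Const', 'Mul']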
(1) Yes: tf.Variable is a low-level function that will always create a new variable node. It is used inside tf.get_variable. The variables it creates can be local or global – the only difference is that it’s low-level. I’d recommend never using it directly.
(2) It’s not about local and global; tf.Variable is just lower-level. It’s used inside tf.get_variable, so in principle you can do everything with it alone, yes. But it will not respect variable scopes, not use the initializer you defined there, not do partitioning, and not do bookkeeping. It’s just low-level. Use tf.get_variable to avoid problems.
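A quick sketch of the difference (the scope and variable names are only for illustration):

    import tensorflow as tf

    with tf.variable_scope("conv"):
        w = tf.get_variable("w", [1])             # tracked by variable_scope
        b = tf.Variable(tf.zeros([1]), name="b")  # low-level, not tracked for reuse

    with tf.variable_scope("conv", reuse=True):
        w_again = tf.get_variable("w", [1])       # returns the same variable
        # tf.get_variable("b", [1]) would fail here: "b" was created with
        # tf.Variable and is invisible to the variable-sharing machinery.

    print(w is w_again)  # True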
I’m facing the same issue. I was wondering if somebody has found a way around it. I can’t really use the trainable variables hack that @rlrs suggested as I’m also using the default RNN code of tensorflow and don’t really want to hack into the rnn definitions for reusing the variables. It would be great if somebody can suggest a way around this without having the need to hack into the rnn code definition in tensorflow. Thanks!
Thank you for the answer. I had the same problem. On the other hand, I would like to add a note: if tf.get_variable_scope().reuse_variables() is (accidentally) called outside a with tf.variable_scope(...): context manager, it sets reuse to True on whatever variable scope is current, and that setting is inherited by every scope opened inside it.
Indeed – variables created in ways other than tf.get_variable(…), especially by the lower-level tf.Variable, are not added to or recognized by variable_scope. This is partly intentional (as some special variables may need special treatment) and partly a result of how the variable sharing mechanism developed.
My suggestion for you would be to always use tf.get_variable if you’re relying on it later – why not?
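Following that suggestion, the sharing pattern tf.get_variable expects looks roughly like this (a sketch; the scope and variable names are arbitrary):

    import tensorflow as tf

    # Create the variable first, with reuse left off...
    with tf.variable_scope("conv"):
        w = tf.get_variable("w", [1])

    # ...then reopen the scope with reuse=True to retrieve it (no "conv/w_1").
    with tf.variable_scope("conv", reuse=True):
        w_shared = tf.get_variable("w", [1])

    # Equivalent form: flip reuse inside the scope's own context manager
    # instead of on the default (outermost) scope.
    with tf.variable_scope("conv") as scope:
        scope.reuse_variables()
        w_shared2 = tf.get_variable("w", [1])

    print(w is w_shared, w is w_shared2)  # True True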
So is the phenomenon that variables restored by Saver.restore() are not retrievable by get_variable() a bug that will be fixed?
Some input on this issue: variables loaded by tf.train.Saver.restore() are also not recognized by tf.get_variable(). The cleanest method I could think of for accessing individual variables loaded this way was to build a dictionary keyed by the variable names.
I think you have a different notion of local and global than the TF one (which only concerns distributed training). With the notion you describe, I think you’re right about (1) – but it’s hard for me to be sure what “global” means for you. All nodes in TF are part of the graph, so they are global in that sense.
If you define something with tf.Variable, it will not be retrieved by tf.get_variable, that’s right. It still has a node name in the graph, so you can feed to it – but you need to be careful about which suffix tf.Variable added to the name and be sure to use exactly that suffix.
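For example (the exact suffix depends on what tf.Variable actually generated; this is only a sketch):

    import tensorflow as tf

    v = tf.Variable(tf.zeros([1]), name="my_var")
    print(v.name)  # e.g. "my_var:0", or "my_var_1:0" if that name was taken

    # Invisible to tf.get_variable, but still reachable through the graph,
    # provided the exact (possibly suffixed) name is used.
    t = tf.get_default_graph().get_tensor_by_name(v.name)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(t))  # [0.]; t can also be used as a feed_dict key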
Perhaps this should be made part of the documentation.