tensorflow: AdamOptimizer's slots "beta1_power" and "beta2_power" are not available via "get_slot()" and "get_slot_names()"
Preamble.
I want to explicitly pass a list of variables to tf.variables_initializer(). I do something like this:
model_variables = my_model.get_variables_list()
optimizer_slots = [
    optimizer.get_slot(var, name)
    for name in optimizer.get_slot_names()
    for var in model_variables
]
all_variables = [
    *model_variables,
    *optimizer_slots,
    global_step,
]
init_op = tf.variables_initializer(all_variables)
When I used the AdamOptimizer, I got the following exception:
FailedPreconditionError (see above for traceback): Attempting to use uninitialized value beta2_power
The problem.
After digging into the TensorFlow sources, I found that AdamOptimizer overrides _create_slots() like this:
def _create_slots(self, var_list):
  # Create the beta1 and beta2 accumulators on the same device as the first
  # variable.
  if (self._beta1_power is None or
      self._beta1_power.graph is not var_list[0].graph):
    with ops.colocate_with(var_list[0]):
      self._beta1_power = variables.Variable(self._beta1,
                                             name="beta1_power",
                                             trainable=False)
      self._beta2_power = variables.Variable(self._beta2,
                                             name="beta2_power",
                                             trainable=False)
  # Create slots for the first and second moments.
  for v in var_list:
    self._zeros_slot(v, "m", self._name)
    self._zeros_slot(v, "v", self._name)
It creates two Variables but does not store them in self._slots, so I cannot access them through the public interface.
This is a problem with the library's API design.
I see that beta1_power and beta2_power are not tied to any particular model variable, while self._slots assumes that every slot has both a name and an associated variable, so there is no obvious fix… It may be reasonable to extend the public API to cover such cases.
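To make the gap concrete, here is a minimal sketch (not from the original report; it reuses the names from the snippet above and assumes minimize() has already been called, so the slots exist):

# Only the per-variable slots are visible through the public interface.
print(optimizer.get_slot_names())        # ['m', 'v'] -- no beta1_power or beta2_power
for var in model_variables:
    print(optimizer.get_slot(var, "m"))  # the per-variable moment slots are reachable
# Nothing in the public Optimizer interface returns beta1_power / beta2_power,
# so the all_variables list above misses them and variables_initializer()
# leaves them uninitialized.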
About this issue
- State: closed
- Created 7 years ago
- Reactions: 1
- Comments: 22 (7 by maintainers)
The problem is that you call model.optimizer.minimize too late. This method creates additional tensors in your graph, so calling it inside a loop is a bad idea; it behaves something like a memory leak. Also, in the case of stateful optimizers (such as AdamOptimizer), minimize creates additional variables. That is why you get the exception you described: your initializer runs before those variables are created. The solution for you is to place the call to model.optimizer.minimize inside the model class itself and store its result in a model attribute. So your problem does not relate to this issue.
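A minimal sketch of that suggestion (the Model class, _build_graph and the toy loss are illustrative names, not from the thread): build the train op once inside the model, keep it as an attribute, and only then construct and run the initializer.

import tensorflow as tf

class Model(object):
    """Hypothetical model that owns its own train op."""
    def __init__(self, optimizer, global_step):
        self.loss = self._build_graph()
        # Build the train op once, here.  minimize() creates Adam's m / v slots
        # and the beta1_power / beta2_power variables, so they already exist
        # when the initializer below is constructed.
        self.train_op = optimizer.minimize(self.loss, global_step=global_step)

    def _build_graph(self):
        # Toy graph: a single trainable weight and a dummy loss.
        self.w = tf.get_variable("w", shape=[], initializer=tf.zeros_initializer())
        return tf.square(self.w - 1.0)

model = Model(tf.train.AdamOptimizer(1e-3), tf.train.get_or_create_global_step())
init_op = tf.global_variables_initializer()  # now also covers the Adam variables
with tf.Session() as sess:
    sess.run(init_op)
    sess.run(model.train_op)

Note that this sketch sidesteps the original goal of listing the variables explicitly; it simply guarantees every variable exists before the initializer is built.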
I do not know why this issue is closed, but as of TF 1.8.0 it is still present. The 'beta1_power' and 'beta2_power' variables are not placed inside the variable_scope, unlike the other variables.
In some cases, "place the call to model.optimizer.minimize within the model class itself and store its result in a model attribute" does not help, because you may want to separate(*) the part where you define the network model from the part where you define the training method.
(*) If you want modular code with a catalog of learning algorithms, this is exactly the case where you separate the 'network builder' from the 'network trainer'.
My take on this is to open two variable_scope context managers (i.e. with blocks), one for the '_build' part and one for the '_train' part, both named with the same leading word 'network', and then use that prefix to collect all variables from both scopes, e.g.:
… with tf.variable_scope('network_build') as net_build: build_network()
… with tf.variable_scope('network_train') as net_train: define_training_method()
… sess.run(tf.variables_initializer(var_list=tf.global_variables(scope='network')))
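A fuller sketch of that prefix-based pattern (build_network and define_training_method are stand-ins for the commenter's own code; the graph is a toy one):

import tensorflow as tf

with tf.variable_scope('network_build') as net_build:
    # Stand-in for build_network().
    x = tf.placeholder(tf.float32, shape=[None, 4])
    w = tf.get_variable('w', shape=[4, 1])
    y = tf.matmul(x, w)

with tf.variable_scope('network_train') as net_train:
    # Stand-in for define_training_method().
    loss = tf.reduce_mean(tf.square(y))
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

# Both scopes share the 'network' prefix, so one call collects their variables.
# Caveat: as noted above, depending on the TF version the Adam accumulators
# (beta1_power / beta2_power) may be created outside these scopes, in which
# case this initializer still misses them.
init_op = tf.variables_initializer(var_list=tf.global_variables(scope='network'))
with tf.Session() as sess:
    sess.run(init_op)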
The private method is sad because it can return wrong results if the optimizer is reused across many graphs (and, indeed, the optimizer itself is kinda wrong for that use-case).
I think we can make the slot API better by allowing variable-free slots (keyed by a graph, or, if None, the current default graph) and having Adam use those. Something called graph_slot, or free_slot?
AdamOptimizer is the only built-in optimizer with this problem, so the simplest quick-and-dirty workaround is to do something like this:
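The snippet that followed is not preserved here; what follows is an assumption-based sketch of such a workaround, built on the private attributes visible in the _create_slots source quoted above (it reaches into internals and may break between versions):

# Collect the regular slots through the public interface, then special-case Adam.
optimizer_slots = [
    optimizer.get_slot(var, name)
    for name in optimizer.get_slot_names()
    for var in model_variables
]
if isinstance(optimizer, tf.train.AdamOptimizer):
    # beta1_power / beta2_power are not exposed via get_slot(), so reach for
    # the private attributes created in _create_slots().
    optimizer_slots.extend([optimizer._beta1_power, optimizer._beta2_power])

init_op = tf.variables_initializer([*model_variables, *optimizer_slots, global_step])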