tensorflow: Gradient computation erroneously returns None

In [5]: tf.gradients(tf.constant(5), tf.Variable(0))
Out[5]: [None]

The derivative of 5 with respect to x should be 0.

About this issue

  • State: closed
  • Created 8 years ago
  • Reactions: 15
  • Comments: 20 (10 by maintainers)

Most upvoted comments

If it’s called “gradients” then I expect it to compute gradients… The gradient of, say, the constant 5 with respect to some vector v is most certainly the zero vector.

Also just from a practicality standpoint, you should be able to compute gradients and then perform mathematical operations with them without having to worry about something unexpectedly becoming a non-Tensor and causing an exception to be raised. In some TF code I wrote recently I had to make the following function to avoid this bug:

def _compute_gradients(tensor, var_list):
  grads = tf.gradients(tensor, var_list)
  return [grad if grad is not None else tf.zeros_like(var)
          for var, grad in zip(var_list, grads)]
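For instance, the zero-filled list can be fed straight into downstream gradient math. A hypothetical usage sketch (the variables and loss here are made up for illustration):

v = tf.Variable([1.0, 2.0])
w = tf.Variable([3.0])  # unused by the loss, so its raw gradient is None
loss = tf.reduce_sum(v * v)
grads = _compute_gradients(loss, [v, w])
# Safe to clip by global norm: no entry is None.
clipped, _ = tf.clip_by_global_norm(grads, clip_norm=5.0)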

Actually, there’s a wrinkle: None is used to indicate a variety of different things:

  1. There is no connection from input to output.
  2. There is a connection, but it’s through a discrete variable with meaningless gradients.
  3. There is a connection, but it’s through an op that doesn’t have an implemented gradient.

(3) in particular would be very bad to replace with zeros.
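To make the ambiguity concrete, here is a minimal sketch (assuming TF 1.x graph mode) in which cases (1) and (2) both come back as the same None:

import tensorflow as tf

x = tf.Variable(2.0)

# Case (1): the output has no graph connection to x at all.
print(tf.gradients(tf.constant(5.0), [x]))  # [None]

# Case (2): connected, but through an integer cast, a discrete op
# with no meaningful gradient.
y = tf.cast(tf.cast(x, tf.int32), tf.float32)
print(tf.gradients(y, [x]))  # [None]

A caller sees the identical [None] in both cases, so it cannot tell a genuinely zero gradient from one that was silently dropped.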

Hmm, I get the efficiency argument. I agree that a keyword arg could be a suitable workaround. E.g.

tf.gradients(tf.constant(5), tf.Variable(0), return_zeros=True)

I’m not sure this isn’t the desired behavior. Returning None makes it explicit that there is no graph connection between the two.

I’ve been using this workaround:

def replace_none_with_zero(l):
  return [0 if i is None else i for i in l]

grads = replace_none_with_zero(tf.gradients([grads[0]], [x, y]))


It looks like there are hundreds of direct uses of tf.gradients within Google, so I don’t think a silent, performance-breaking change is okay. If we’re going to change the default behavior, I think the only way would be to make a special Zeros class and give it suitable arithmetic overloads. That way, anyone who doesn’t realize this and treats the result as a normal tensor in a way that doesn’t take advantage of the zeros will hit an exception.
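A rough sketch of what such a Zeros class might look like (hypothetical; nothing like this exists in the TensorFlow API):

class Zeros(object):
  """Symbolic all-zeros gradient; deliberately not a tf.Tensor."""

  def __init__(self, shape, dtype):
    self.shape = shape
    self.dtype = dtype

  # Overload the cheap identities so zero gradients cost nothing:
  def __add__(self, other):
    return other  # 0 + x == x
  __radd__ = __add__

  def __mul__(self, other):
    return self   # 0 * x == 0
  __rmul__ = __mul__

Anything not overloaded (sess.run, tf.matmul, and so on) fails loudly because the object is not a real tensor, which is exactly the point: code that silently assumed a dense tensor surfaces immediately instead of quietly getting slower.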

For now, how about a return_zeros argument that defaults to False?

Then I take it the official documentation of tf.gradients() is erroneous?

unconnected_gradients determines the value returned for each x in xs if it is unconnected in the graph to ys. By default this is None to safeguard against errors. Mathematically these gradients are zero, which can be requested using the ‘zero’ option. tf.UnconnectedGradients provides the following options and behaviors:

a = tf.ones([1, 2])
b = tf.ones([3, 1])
g1 = tf.gradients([b], [a], unconnected_gradients='none')
sess.run(g1)  # [None]

g2 = tf.gradients([b], [a], unconnected_gradients='zero')
sess.run(g2)  # [array([[0., 0.]], dtype=float32)]

There doesn’t seem to be such a keyword argument though.

I think I’ve just run into this:

with tf.Graph().as_default():
  with tf.Session() as sess:
    x = tf.constant([1.0])
    x_double = 2*x
    print(sess.run(tf.hessians(x_double, x)))

# ValueError: None values not supported.

=\
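One workaround in the same spirit as the snippets above (a sketch, not how tf.hessians itself works): zero-fill each None before differentiating a second time:

import tensorflow as tf

def _grad_or_zeros(y, x):
  # Differentiate, substituting zeros when tf.gradients returns None.
  g = tf.gradients(y, x)[0]
  return g if g is not None else tf.zeros_like(x)

with tf.Graph().as_default():
  with tf.Session() as sess:
    x = tf.constant([1.0])
    x_double = 2 * x
    grad = _grad_or_zeros(x_double, x)  # d(2x)/dx == 2
    hess = _grad_or_zeros(grad, x)      # d(2)/dx is disconnected -> zeros
    print(sess.run(hess))               # [0.]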

Yep, I agree that this is a bug; my comment was poorly worded. @mrry: Do we use the None feature anywhere that changing this would break? It could certainly cause some code to get slower, which is a potential concern.