tensorflow: Gradient computation erroneously returns None
In [5]: tf.gradients(tf.constant(5), tf.Variable(0))
Out[5]: [None]
The derivative of 5 with respect to x should be 0.
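For contrast, here is a minimal sketch (using the TF 1.x graph API from the report, with float values so the gradient is well defined) of the connected vs. unconnected cases:

```python
import tensorflow as tf

x = tf.Variable(0.0)

# Connected: the output depends on x, so a real gradient tensor comes back.
print(tf.gradients(x * x, x))             # [<tf.Tensor ...>], i.e. 2*x

# Unconnected: a constant has no graph path to x, so we get [None], not [0].
print(tf.gradients(tf.constant(5.0), x))  # [None]
```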
About this issue
- Original URL
- State: closed
- Created 8 years ago
- Reactions: 15
- Comments: 20 (10 by maintainers)
Commits related to this issue
- Merge pull request #783 from dweekly/master Update sum_of_squares (TF 0.10) to mean_squared_error (TF 0.12) — committed to tarasglek/tensorflow by nealwu 7 years ago
- Eager/g3doc: Gradients with respect to constants are None and not 0. Same behavior as tf.gradients() for graphs. Some discussion of this choice in #783 PiperOrigin-RevId: 190096919 — committed to benoitsteiner/tensorflow by asimshankar 6 years ago
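The eager-mode behavior noted in the second commit can be reproduced with tf.GradientTape (a sketch assuming TF 2.x-style eager execution):

```python
import tensorflow as tf

x = tf.Variable(0.0)

with tf.GradientTape() as tape:
    y = tf.constant(5.0)  # y never touches x

# Unconnected gradients are None in eager mode too, matching tf.gradients().
print(tape.gradient(y, x))  # None
```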
If it’s called “gradients” then I expect it to compute gradients… The gradient of e.g. 5 with respect to some vector v is most certainly the zero vector.
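In symbols: for any constant c, ∇_v c = 0, the zero vector with the same shape as v.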
Also, just from a practicality standpoint, you should be able to compute gradients and then perform mathematical operations on them without having to worry about something unexpectedly becoming a non-Tensor and causing an exception to be raised. In some TF code I wrote recently I had to write a function like the work-around shown further down to avoid this bug.
Actually, there’s a wrinkle: None is used to indicate a variety of different things; (3) in particular would be very bad to replace with zeros.
Hmm, I get the efficiency argument. I agree that a keyword arg could be a suitable workaround, e.g.:
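For instance (a hypothetical sketch; return_zeros is just the name proposed further down in this thread, not a real argument, and ys and xs are placeholders):

```python
# Hypothetical keyword, not an actual tf.gradients() argument.
grads = tf.gradients(ys, xs, return_zeros=True)
# With return_zeros=True, unconnected entries would come back as
# properly-shaped zero tensors instead of None; the default False
# would keep today's cheaper behavior.
```

(For what it's worth, much later TensorFlow releases did add an option along these lines: tf.gradients(ys, xs, unconnected_gradients=tf.UnconnectedGradients.ZERO).)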
I’m not sure this isn’t the desired behavior. Returning None makes it explicit that there is no graph connection between the two. I’ve been using this work-around:
def replace_none_with_zero(grads):
    # Swap each unconnected (None) gradient for a scalar zero.
    return [0 if g is None else g for g in grads]
grads = replace_none_with_zero(tf.gradients([grads[0]], [x, y]))
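A shape-preserving variant of that work-around (a sketch: the scalar 0 above can break downstream ops that expect the gradient's shape, so tf.zeros_like on the matching variable is safer):

```python
def replace_none_with_zeros_like(grads, variables):
    # Substitute a zero tensor shaped like the variable for each
    # unconnected (None) gradient.
    return [tf.zeros_like(v) if g is None else g
            for g, v in zip(grads, variables)]

grads = replace_none_with_zeros_like(tf.gradients([grads[0]], [x, y]), [x, y])
```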
It looks like there are hundreds of direct uses of tf.gradients within Google, so I don’t think a silent performance-breaking change is okay. If we’re going to change the default behavior, I think the only way would be to make the special Zeros class and give it suitable arithmetic overloads. That way, anyone who doesn’t realize this and treats it as a normal tensor, in a way that doesn’t take advantage of the zeros, will get an exception. For now, how about a return_zeros argument that defaults to False?
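A rough sketch of that Zeros idea (hypothetical; no such class exists in TensorFlow):

```python
class Zeros(object):
    """Hypothetical symbolic zero gradient: free in arithmetic, loud otherwise."""

    def __init__(self, shape, dtype):
        self.shape = shape
        self.dtype = dtype

    # Arithmetic that can exploit the zero stays cheap:
    def __add__(self, other):
        return other   # 0 + x == x, no dense tensor allocated
    __radd__ = __add__

    def __mul__(self, other):
        return self    # 0 * x == 0, still symbolic
    __rmul__ = __mul__

    # Anything else that expects a real Tensor fails fast instead of silently:
    def __getattr__(self, name):
        raise TypeError("Zeros is not a Tensor; materialize it with "
                        "tf.zeros(shape, dtype) first.")
```

Misuse then surfaces as an exception rather than as a silent dense-zeros slowdown, which is the property the comment above is after.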
Then I take it the official documentation of tf.gradients() is erroneous?
There doesn’t seem to be such a keyword argument though.
I think I’ve just run into this. =\
Yep, I agree that this is a bug; my comment was poorly worded. @mrry: Do we use the None feature anywhere that changing this would break? It could certainly cause some code to get slower, which is a potential concern.
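The slowdown concern is concrete: silently materializing zeros would mean allocating a dense tensor for every unconnected variable. A sketch of the cost, assuming a large embedding-sized variable:

```python
big = tf.Variable(tf.zeros([50000, 512]))  # ~100 MB of float32
loss = tf.constant(1.0)                    # does not depend on big

# Today this is free:
print(tf.gradients(loss, big))             # [None]
# If zeros were returned implicitly, a 50000x512 zero tensor would be
# built and flow through any downstream arithmetic for no benefit.
```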