tensorflow: Custom loss function is not working

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10 20H2
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version: 2.3.0
  • Python version: 3.8.3

Standalone code to reproduce the issue https://colab.research.google.com/drive/1wa93OGIfXVY66GHO6wI1O9PnZyxXB-KJ

Describe the behavior: I am implementing the PPO algorithm in Keras and ran into the following error, which comes from the custom loss function.

Error message:

Traceback (most recent call last):
  File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\eager\execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
TypeError: An op outside of the function building code is being passed
a "Graph" tensor. It is possible to have Graph tensors
leak out of the function building context by including a
tf.init_scope in your function building code.
For example, the following function will fail:
  @tf.function
  def has_init_scope():
    my_constant = tf.constant(1.)
    with tf.init_scope():
      added = my_constant * 2
The graph tensor has name: old_prediction_input:0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 426, in <module>
    agent.train()
  File "train.py", line 370, in train
    actor_loss = self.actor.fit(
  File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\keras\engine\training.py", line 108, in _method_wrapper
    return method(self, *args, **kwargs)
  File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1098, in fit
    tmp_logs = train_function(iterator)
  File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\eager\def_function.py", line 780, in __call__
    result = self._call(*args, **kwds)
  File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\eager\def_function.py", line 840, in _call
    return self._stateless_fn(*args, **kwds)
  File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\eager\function.py", line 2829, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\eager\function.py", line 1843, in _filtered_call
    return self._call_flat(
  File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\eager\function.py", line 1923, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\eager\function.py", line 545, in call
    outputs = execute.execute(
  File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\eager\execute.py", line 72, in quick_execute
    raise core._SymbolicException(
tensorflow.python.eager.core._SymbolicException: Inputs to eager execution function cannot be Keras symbolic tensors, but found [<tf.Tensor 'old_prediction_input:0' shape=(None, 2) dtype=float32>, <tf.Tensor 'advantage_input:0' shape=(None, 1) dtype=float32>, <tf.Tensor 'reward_input:0' shape=(None, 1) dtype=float32>, <tf.Tensor 'value_input:0' shape=(None, 1) dtype=float32>]

Check the custom loss function here on Colab

Check the actor model here on Colab

After searching, I found one workaround: pass run_eagerly=True to model.compile(), i.e. actor_model.compile(..., run_eagerly=True)

But after setting run_eagerly=True, actor.history['loss'] reports a loss of 0. I cannot debug this by printing total_loss inside the ppo_loss(…) function, because that raises AttributeError: 'Tensor' object has no attribute 'numpy'.
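
One way to inspect a value inside a loss without calling .numpy() is tf.print, which works on graph tensors as well as eager ones. The following is a minimal sketch under stated assumptions: debug_loss and run_once are hypothetical names, and the loss body is a stand-in mean squared error, not the PPO loss from the notebook.

```python
import tensorflow as tf

def debug_loss(y_true, y_pred):
    # Stand-in loss for illustration; the real ppo_loss body would go here.
    total_loss = tf.reduce_mean(tf.square(y_true - y_pred))
    # tf.print is executed when the graph runs, so no .numpy() call is needed.
    tf.print("total_loss:", total_loss)
    return total_loss

@tf.function  # force graph mode, which is what Keras uses inside fit() by default
def run_once(y_true, y_pred):
    return debug_loss(y_true, y_pred)

result = run_once(tf.constant([[1.0, 0.0]]), tf.constant([[0.5, 0.5]]))
```

The value is printed each time the traced function executes, whereas Python's built-in print would only run once, during tracing, and show the symbolic tensor.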

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 20 (5 by maintainers)

Most upvoted comments

Correct me if I am wrong.

The custom loss function will only work when a Tensor is returned and not a Symbolic KerasTensor or Symbolic Tensor.

So the main issue here is that the custom loss function returns a symbolic KerasTensor rather than a Tensor. This happens because the extra inputs to the custom loss function are themselves symbolic KerasTensors.

So the question is: How to pass the extra parameters to the custom loss function in Tensor form instead of Symbolic KerasTensor?

Complete Report on Causes

From TensorFlow 2.3.0:

def ppo_loss(oldpolicy_probs, advantage, reward, value):
    def loss(y_true, y_pred):
        print(oldpolicy_probs) # Tensor("old_prediction_input:0", shape=(None, 2), dtype=float32)
        print(advantage)       # Tensor("advantage_input:0", shape=(None, 1), dtype=float32)
        print(reward)          # Tensor("reward_input:0", shape=(None, 1), dtype=float32)
        print(value)           # Tensor("value_input:0", shape=(None, 1), dtype=float32)

        print(y_true)     # Tensor("IteratorGetNext:5", shape=(128, 2), dtype=float32)
        print(y_pred)     # Tensor("functional_1/policy/Tanh:0", shape=(128, 2), dtype=float32)
        
        # ... Compute Loss ...
    return loss

From TensorFlow 2.5.0:

Inside the custom loss function, some of the values are KerasTensors while others are Tensors.

def ppo_loss(oldpolicy_probs, advantage, reward, value):
    def loss(y_true, y_pred):
        print(oldpolicy_probs) # KerasTensor(type_spec=TensorSpec(shape=(None, 2), dtype=tf.float32, name='old_prediction_input'), name='old_prediction_input', description="created by layer 'old_prediction_input'")
        print(advantage)       # KerasTensor(type_spec=TensorSpec(shape=(None, 1), dtype=tf.float32, name='advantage_input'), name='advantage_input', description="created by layer 'advantage_input'")
        print(reward)          # KerasTensor(type_spec=TensorSpec(shape=(None, 1), dtype=tf.float32, name='reward_input'), name='reward_input', description="created by layer 'reward_input'")
        print(value)           # KerasTensor(type_spec=TensorSpec(shape=(None, 1), dtype=tf.float32, name='value_input'), name='value_input', description="created by layer 'value_input'")

        print(y_true)     # Tensor("IteratorGetNext:5", shape=(128, 2), dtype=float32)
        print(y_pred)     # Tensor("model/policy/Tanh:0", shape=(128, 2), dtype=float32)
        
        # ... Compute Loss ...
    return loss

So what should be done in this case? How can the values be passed to the custom loss function so that they are also Tensors?
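
One commonly used workaround, not confirmed anywhere in this thread, is to pack the extra per-sample values into y_true and slice them apart inside the loss, so that everything the loss sees arrives as a real Tensor from the data pipeline. The sketch below assumes a softmax actor, a column layout of [old_probs | advantage], and a generic clipped-surrogate loss; packed_ppo_loss, N_ACTIONS and CLIP are hypothetical names, and the loss is not the one from the notebook.

```python
import numpy as np
import tensorflow as tf

N_ACTIONS = 2  # matches shape=(None, 2) in the printouts above
CLIP = 0.2     # assumed PPO clipping range

def packed_ppo_loss(y_true, y_pred):
    # Assumed column layout of y_true: [old_probs (N_ACTIONS) | advantage (1)].
    old_probs = y_true[:, :N_ACTIONS]
    advantage = y_true[:, N_ACTIONS:N_ACTIONS + 1]
    ratio = y_pred / (old_probs + 1e-10)
    clipped = tf.clip_by_value(ratio, 1.0 - CLIP, 1.0 + CLIP)
    # Per-sample clipped surrogate; Keras averages it over the batch.
    return -tf.reduce_mean(
        tf.minimum(ratio * advantage, clipped * advantage), axis=-1)

# Hypothetical actor: 4 state features in, N_ACTIONS probabilities out.
actor = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(N_ACTIONS, activation="softmax"),
])
actor.compile(optimizer="adam", loss=packed_ppo_loss)

# Random placeholder data, only to show the shapes involved.
states = np.random.rand(8, 4).astype("float32")
old_probs = np.full((8, N_ACTIONS), 0.5, dtype="float32")
advantages = np.random.randn(8, 1).astype("float32")
targets = np.concatenate([old_probs, advantages], axis=1)  # shape (8, 3)

history = actor.fit(states, targets, epochs=1, batch_size=8, verbose=0)
```

With this layout the model needs no extra Input layers for the loss parameters, so no symbolic tensor is captured by the closure.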

Please let us know if the provided information from documentation helps to resolve your issue.

Thank you!

I may be misunderstanding, but it looks like the documented approach only allows fixed constant parameters, defined when the loss is created. For the OP and myself, we need the loss to be a function of things that change with each training example. An example would be if the loss wanted the model input in addition to y_true and y_pred.
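
For a loss that depends on the model input, one option is to skip compile()/fit() and write the training step by hand with tf.GradientTape, where every argument is an ordinary Tensor. This is a sketch under assumptions, not the approach from the documentation mentioned above: loss_with_inputs, train_step and the input-weighted squared error are all hypothetical.

```python
import tensorflow as tf

def loss_with_inputs(x, y_true, y_pred):
    # Hypothetical loss: squared error weighted by the magnitude of each input row.
    weights = 1.0 + tf.reduce_sum(tf.abs(x), axis=-1, keepdims=True)
    return tf.reduce_mean(weights * tf.square(y_true - y_pred))

# Hypothetical regression model: 4 features in, 1 value out.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(1),
])
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)

def train_step(x, y_true):
    # The tape records the forward pass so gradients can be taken afterwards.
    with tf.GradientTape() as tape:
        y_pred = model(x, training=True)
        loss = loss_with_inputs(x, y_true, y_pred)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

# Random placeholder batch, only to show the call.
x_batch = tf.random.uniform((8, 4))
y_batch = tf.random.normal((8, 1))
loss_value = train_step(x_batch, y_batch)
```

Because the loss is an ordinary function call here, it can take any number of per-example arguments, at the cost of giving up the conveniences of fit() such as callbacks and history tracking.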

Complete Report on Errors

Errors are generated from custom loss functions only.

✅ Works with TensorFlow 1.15.0:

https://colab.research.google.com/drive/1CdTzQZ-2mUeNebAv9qV-kW8WoAHX61lV


❌ Error from TensorFlow 2.3.0:

_SymbolicException: Inputs to eager execution function cannot be Keras symbolic tensors, but found [<tf.Tensor 'old_prediction_input:0' shape=(None, 2) dtype=float32>, <tf.Tensor 'advantage_input:0' shape=(None, 1) dtype=float32>, <tf.Tensor 'reward_input:0' shape=(None, 1) dtype=float32>, <tf.Tensor 'value_input:0' shape=(None, 1) dtype=float32>]

Standalone code to reproduce the issue with Tensorflow 2.3.0 https://colab.research.google.com/drive/1wa93OGIfXVY66GHO6wI1O9PnZyxXB-KJ


❌ Error from TensorFlow 2.5.0:

TypeError: Cannot convert a symbolic Keras input/output to a numpy array. This error may indicate that you're trying to pass a symbolic value to a NumPy call, which is not supported. Or, you may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically converting the API call to a lambda layer in the Functional Model.

Standalone code to reproduce the issue with Tensorflow 2.5.0 https://colab.research.google.com/drive/1dqVHrnqUmD5ZipzMwAJa0XcnZE1fh-HG

Similar error as mentioned in the below issues:

I have already updated the Colab notebook to use a standard loss function and it works, so there is definitely a problem with the custom loss function. I have also changed imports such as keras to tensorflow.python.keras. Still, the error remains.

Hi @Dhyeythumar, this is a lot of custom code to look through. But often a cause of this error message is trying to use numpy ops in graph mode, or relying on python side effects (like appending to lists) instead of using TensorArrays in graph mode. I am not very familiar with gym so I can’t point to the exact issue in your code. However, I would suggest taking a look at the tf.function guide to get a sense of common pitfalls in graph mode. To isolate where the issue is, you might want to try running your code with a standard loss and see if it runs without this error.
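
The TensorArray suggestion above can be sketched as follows. The function running_sums is a hypothetical example unrelated to the PPO code; it shows how to collect per-step values inside a tf.function with tf.TensorArray instead of appending to a Python list.

```python
import tensorflow as tf

@tf.function
def running_sums(x):
    n = tf.shape(x)[0]
    # A TensorArray replaces the Python list that would not work in graph mode.
    ta = tf.TensorArray(tf.float32, size=n)
    total = tf.constant(0.0)
    for i in tf.range(n):  # converted to a graph loop by AutoGraph
        total += x[i]
        ta = ta.write(i, total)  # write() returns the updated array
    return ta.stack()

sums = running_sums(tf.constant([1.0, 2.0, 3.0]))  # values: 1, 3, 6
```

Note that write() returns a new TensorArray handle that must be reassigned, unlike list.append, which mutates in place.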

Setting run_eagerly=True will get rid of the error message, but it’s not a great solution as it can really affect the performance of your code. This should really only be used for debugging purposes.

Additionally, you should update your import statements so that you use tf.keras instead of keras, e.g. from tensorflow.keras.layers import Input, Dense

Yes, I have also mentioned this in the above issue. So when I try to get the value from a tensor in ppo_loss() function I am getting AttributeError: 'Tensor' object has no attribute 'numpy' error. Basically, I am doing this to check why I am getting 0 in that ppo_loss function.

For now, I have updated the code on colab so we won’t get AttributeError: 'Tensor' object has no attribute 'numpy'.

@Dhyeythumar

I tried this in Colab with TF version 2.3 and I am seeing a different error message (AttributeError: 'Tensor' object has no attribute 'numpy'). Please find the gist here. Thanks!