tensorflow: Custom loss function is not working
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10 20H2
- TensorFlow installed from (source or binary): binary
- TensorFlow version: 2.3.0
- Python version: 3.8.3
Standalone code to reproduce the issue: https://colab.research.google.com/drive/1wa93OGIfXVY66GHO6wI1O9PnZyxXB-KJ
Describe the behavior: I am implementing the PPO algorithm using Keras but ran into the following issue with a custom loss function.
Error message:
Traceback (most recent call last):
File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\eager\execute.py", line 59, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
TypeError: An op outside of the function building code is being passed
a "Graph" tensor. It is possible to have Graph tensors
leak out of the function building context by including a
tf.init_scope in your function building code.
For example, the following function will fail:
  @tf.function
  def has_init_scope():
    my_constant = tf.constant(1.)
    with tf.init_scope():
      added = my_constant * 2
The graph tensor has name: old_prediction_input:0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 426, in <module>
agent.train()
File "train.py", line 370, in train
actor_loss = self.actor.fit(
File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\keras\engine\training.py", line 108, in _method_wrapper
return method(self, *args, **kwargs)
File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1098, in fit
tmp_logs = train_function(iterator)
File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\eager\def_function.py", line 780, in __call__
result = self._call(*args, **kwds)
File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\eager\def_function.py", line 840, in _call
return self._stateless_fn(*args, **kwds)
File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\eager\function.py", line 2829, in __call__
return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\eager\function.py", line 1843, in _filtered_call
return self._call_flat(
File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\eager\function.py", line 1923, in _call_flat
return self._build_call_outputs(self._inference_function.call(
File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\eager\function.py", line 545, in call
outputs = execute.execute(
File "C:\Users\dhyey\Desktop\Train-ml-agents\python-envs\offline_training\lib\site-packages\tensorflow\python\eager\execute.py", line 72, in quick_execute
raise core._SymbolicException(
tensorflow.python.eager.core._SymbolicException: Inputs to eager execution function cannot be Keras symbolic tensors, but found [<tf.Tensor 'old_prediction_input:0' shape=(None, 2) dtype=float32>, <tf.Tensor 'advantage_input:0' shape=(None, 1) dtype=float32>, <tf.Tensor 'reward_input:0' shape=(None, 1) dtype=float32>, <tf.Tensor 'value_input:0' shape=(None, 1) dtype=float32>]
Check the custom loss function here on Colab
Check the actor model here on Colab
After searching, I found one workaround: adding run_eagerly=True to the model.compile() call, i.e. actor_model.compile(... , run_eagerly=True).
But with run_eagerly set to True, I get a loss value of 0 from actor.history['loss']. To debug this I tried to print the total_loss value inside the ppo_loss(…) function, but that fails with AttributeError: 'Tensor' object has no attribute 'numpy'.
About this issue
- State: closed
- Created 4 years ago
- Comments: 20 (5 by maintainers)
The custom loss function will only work when a Tensor is returned and not a Symbolic KerasTensor or Symbolic Tensor.

So the main issue here is that the custom loss function is returning a Symbolic KerasTensor and not a Tensor. This happens because the inputs to the custom loss function are in Symbolic KerasTensor form.

So the question is: how to pass the extra parameters to the custom loss function in Tensor form instead of Symbolic KerasTensor?

Complete Report on Causes
From TensorFlow 2.3.0:
From TensorFlow 2.5.0:
In the custom loss function, some of the data is in KerasTensor form and the rest is in Tensor form.
So what to do in this case? How can the values be passed to the custom loss function so that they are also in Tensor form?
I may be misunderstanding, but it looks like the documented approach only allows fixed constant parameters, defined when the loss is created. For the OP and myself, we need the loss to be a function of things that change with each training example. An example would be if the loss wanted the model input in addition to y_true and y_pred.
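One workaround that is often used for this situation, sketched here under assumptions rather than taken from the notebook, is to pack the per-example extras into y_true so they reach the loss as ordinary batch tensors instead of symbolic Input tensors captured from outside. The names packed_ppo_loss, build_actor, NUM_ACTIONS, CLIP_EPSILON and state_dim are hypothetical, the y_true layout is an assumption, and the value and reward inputs from the traceback are left out to keep it short.

```python
import tensorflow as tf

NUM_ACTIONS = 2      # matches the (None, 2) shape of old_prediction_input in the traceback
CLIP_EPSILON = 0.2   # assumed clipping range, not taken from the notebook


def packed_ppo_loss(y_true, y_pred):
    """Clipped surrogate loss.

    Assumed y_true layout per row: [one-hot action | old_prediction | advantage].
    """
    y_true = tf.cast(y_true, y_pred.dtype)
    action = y_true[:, :NUM_ACTIONS]
    old_prediction = y_true[:, NUM_ACTIONS:2 * NUM_ACTIONS]
    advantage = y_true[:, 2 * NUM_ACTIONS:]  # shape (batch, 1)

    # Probability of the taken action under the new and the old policy.
    prob = tf.reduce_sum(action * y_pred, axis=-1, keepdims=True)
    old_prob = tf.reduce_sum(action * old_prediction, axis=-1, keepdims=True)

    ratio = prob / (old_prob + 1e-10)
    clipped = tf.clip_by_value(ratio, 1.0 - CLIP_EPSILON, 1.0 + CLIP_EPSILON)
    return -tf.reduce_mean(tf.minimum(ratio * advantage, clipped * advantage))


def build_actor(state_dim=4):
    """Hypothetical actor with a single state input; extras travel via y_true."""
    state = tf.keras.Input(shape=(state_dim,), name="state_input")
    hidden = tf.keras.layers.Dense(16, activation="relu")(state)
    policy = tf.keras.layers.Dense(NUM_ACTIONS, activation="softmax")(hidden)
    model = tf.keras.Model(state, policy)
    model.compile(optimizer="adam", loss=packed_ppo_loss)
    return model
```

With this layout the training targets would be built as something like np.concatenate([actions_one_hot, old_predictions, advantages], axis=-1) and passed as y to fit(), so the loss only ever sees y_true and y_pred and no run_eagerly=True is needed.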
Complete Report on Errors
✅ Works with TensorFlow 1.15.0:
https://colab.research.google.com/drive/1CdTzQZ-2mUeNebAv9qV-kW8WoAHX61lV
❌ Error from TensorFlow 2.3.0:
Standalone code to reproduce the issue with Tensorflow 2.3.0 https://colab.research.google.com/drive/1wa93OGIfXVY66GHO6wI1O9PnZyxXB-KJ
❌ Error from TensorFlow 2.5.0:
Standalone code to reproduce the issue with Tensorflow 2.5.0 https://colab.research.google.com/drive/1dqVHrnqUmD5ZipzMwAJa0XcnZE1fh-HG
Similar error as mentioned in the below issues:
I have already updated the colab notebook with a standard loss function and it works, so there is definitely a problem with the custom loss function. I have also updated the imports from keras to tensorflow.python.keras. Still, the error remains.

Hi @Dhyeythumar, this is a lot of custom code to look through. But often a cause of this error message is trying to use numpy ops in graph mode, or relying on Python side effects (like appending to lists) instead of using TensorArrays in graph mode. I am not very familiar with gym so I can't point to the exact issue in your code. However, I would suggest taking a look at the tf.function guide to get a sense of common pitfalls in graph mode. To isolate where the issue is, you might want to try running your code with a standard loss and see if it runs without this error.
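To illustrate the TensorArray point above, here is a minimal sketch (the function name running_sums is made up for illustration and is unrelated to the PPO code) that accumulates per-step results inside a tf.function without appending to a Python list:

```python
import tensorflow as tf


@tf.function
def running_sums(values):
    """Return the cumulative sums of a 1-D float32 tensor using a TensorArray."""
    n = tf.shape(values)[0]
    results = tf.TensorArray(tf.float32, size=n)
    total = tf.constant(0.0)
    # AutoGraph turns this loop over tf.range into a graph-level while loop.
    for i in tf.range(n):
        total += values[i]
        # write() returns a new TensorArray handle that must be reassigned.
        results = results.write(i, total)
    return results.stack()
```

A Python list appended to inside the same loop would only be touched while the function is traced, which is why the guide recommends TensorArray for this pattern.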
Setting run_eagerly=True will get rid of the error message, but it's not a great solution as it can really affect the performance of your code. This should really only be used for debugging purposes.

Additionally, you should update your import statements so you're using tf.keras instead of keras, e.g. from tensorflow.keras.layers import Input, Dense, etc.

Yes, I have also mentioned this in the above issue. So when I try to get the value from a tensor in ppo_loss() function I am getting
AttributeError: 'Tensor' object has no attribute 'numpy' error. Basically, I am doing this to check why I am getting 0 in that ppo_loss function.

For now, I have updated the code on colab so we won't get AttributeError: 'Tensor' object has no attribute 'numpy'.

@Dhyeythumar
I have tried in colab with TF version 2.3 and I am seeing a different error message (AttributeError: 'Tensor' object has no attribute 'numpy'). Please find the gist here. Thanks!
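On the 'numpy' AttributeError: tensors inside graph-traced code are symbolic and have no .numpy() method, so one option for inspecting a loss value is tf.print, which also runs in graph mode. A minimal sketch, with a made-up mean-squared-error loss named debug_loss standing in for ppo_loss:

```python
import tensorflow as tf


@tf.function
def debug_loss(y_true, y_pred):
    """Stand-in loss (mean squared error) that prints its value when executed."""
    total_loss = tf.reduce_mean(tf.square(y_true - y_pred))
    # tf.print runs as a graph op, so no .numpy() call is needed here.
    tf.print("total_loss:", total_loss)
    return total_loss
```

Whether this explains the zero loss reported above is a separate question; the sketch only shows a way to look at the value without switching the whole model to eager execution.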