tensorflow: Grappler error when Softmax input has a variable dimension

System information

  • Have I written custom code: Yes
  • OS Platform and Distribution: Ubuntu 18.04
  • TensorFlow installed from: binary
  • TensorFlow version: 2.5.0
  • Python version: 3.6.9
  • CUDA/cuDNN version: 11.2/8.1.0
  • GPU model and memory: NVIDIA Quadro P620 (4GB)

Describe the current behavior

A Grappler pass logs an error when Softmax is used inside a tf.function whose input signature has one or more variable dimensions (e.g., a variable batch size). The log message first appeared in TensorFlow 2.5.

It does not seem to cause problems when running the model, but it might indicate a bug in the Grappler cost-estimator implementation.

Describe the expected behavior

This operation should not produce any warning or error.

Contributing

  • Do you want to contribute a PR? No
  • Briefly describe your candidate solution (if contributing):

Standalone code to reproduce the issue

import tensorflow as tf

# The None in the input signature makes the batch dimension variable,
# which leaves Grappler's cost estimator with an unknown shape.
input_signature = (tf.TensorSpec([None, 20], tf.float32),)
softmax = tf.function(tf.nn.softmax, input_signature=input_signature)
softmax(tf.random.uniform([2, 20]))
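
For comparison, a fully specified input signature does not appear to trigger the warning, since the cost estimator can then resolve every shape statically. A minimal sketch of this observation (static_signature and softmax_static are illustrative names, not part of the original report):

import tensorflow as tf

# Comparison case: with no None dimensions, Grappler's cost estimator
# sees fully known shapes and, in my tests, logs no PredictCost() warning.
static_signature = (tf.TensorSpec([2, 20], tf.float32),)
softmax_static = tf.function(tf.nn.softmax, input_signature=static_signature)
softmax_static(tf.random.uniform([2, 20]))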

Other info / logs

The code above logs the following error:

2021-07-02 07:34:12.478300: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:689] Error in PredictCost() for the op: op: "Softmax" attr { key: "T" value { type: DT_FLOAT } } inputs { dtype: DT_FLOAT shape { unknown_rank: true } } device { type: "GPU" vendor: "NVIDIA" model: "Quadro P620" frequency: 1442 num_cores: 4 environment { key: "architecture" value: "6.1" } environment { key: "cuda" value: "11020" } environment { key: "cudnn" value: "8100" } num_registers: 65536 l1_cache_size: 24576 l2_cache_size: 524288 shared_memory_size_per_multiprocessor: 98304 memory_size: 3092316160 bandwidth: 96128000 } outputs { dtype: DT_FLOAT shape { unknown_rank: true } }
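
For anyone who only needs to silence the log until a fix lands, here is a hedged workaround sketch. Both knobs are standard TensorFlow APIs, but raising the log level hides other C++ warnings, and disabling the meta-optimizer turns off all Grappler graph optimizations, not just the cost estimator:

import os

# Option 1: raise the C++ logging threshold so W-level messages (including
# this PredictCost() warning) are suppressed. Must be set before importing
# TensorFlow.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

import tensorflow as tf

# Option 2: disable the Grappler meta-optimizer entirely, so the cost
# estimator never runs. This also disables other graph optimizations.
tf.config.optimizer.set_experimental_options({"disable_meta_optimizer": True})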

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Reactions: 13
  • Comments: 28 (9 by maintainers)

Most upvoted comments

I experience the same error when using models with attention.

The warning still exists in TensorFlow 2.7.

I am getting this issue on TensorFlow 2.13 (Ubuntu 22.04, Python 3.10, CUDA 11.8).

Is this just a warning, or does it affect the performance of the model? I'm seeing a similar warning message when using an Attention layer on GPUs during hyperparameter tuning with Keras Tuner. Can someone please clarify? Thank you!

I think this issue is still present, at least in the C++ API on Windows 10 with TensorFlow 2.9.2 (built from source, obviously). We haven't observed degraded performance in our internal tests, so I'll just leave this here so others know about it.

It seems this was fixed in TensorFlow 2.8. The initial code snippet no longer shows a warning.

@tilakrayal Feel free to close this issue.

Same warning here with softmax.