tensorflow: XLA bug w/ Keras: "Node name contains invalid characters"

System information

  • Have I written custom code: yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
  • TensorFlow installed from (source or binary): pip
  • TensorFlow version (use command below): tensorflow-gpu==1.14.0+nv
  • Python version: 3.6.8
  • CUDA/cuDNN version: 10.1
  • GPU model and memory: RTX 2080 Ti, NVIDIA Driver 418.43

Describe the current behavior

The following warning is logged at predict time:

2019-09-27 11:02:42.377419: W tensorflow/core/common_runtime/process_function_library_runtime.cc:667] Ignoring multi-device function optimization failure: Invalid argument: Node '_arg_segments_ids_input_0_1_0_arg': Node name contains invalid characters

As far as I can see, XLA optimizations are also not enabled (performance doesn’t improve).

Describe the expected behavior

No warning should be logged, and XLA optimizations should be applied.

Code to reproduce the issue

This occurred in real keras-bert based code, but it also reproduces with the following minimal example:

import numpy as np
import tensorflow as tf
from tensorflow.keras.backend import set_session
from tensorflow.keras.layers import Input, Embedding, Add
from tensorflow.keras.models import Model

# Enable XLA JIT compilation globally for the session (TF 1.x API).
config = tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1
set_session(tf.Session(config=config))

seq_len = 80
batch_size = 5000

tokens_ids_input = Input(shape=(seq_len,), dtype='int32', name='tokens_ids_input')
segments_ids_input = Input(shape=(seq_len,), dtype='int32', name='segments_ids_input')

# Sum a token embedding and a segment embedding, as in BERT-style models.
tokens_embedding = Embedding(input_dim=5000, output_dim=16)(tokens_ids_input)
segments_embedding = Embedding(input_dim=2, output_dim=16)(segments_ids_input)
output = Add()([tokens_embedding, segments_embedding])

model = Model(inputs=[tokens_ids_input, segments_ids_input], outputs=[output])

# The warning is logged during this call.
model.predict([np.zeros((batch_size, seq_len)), np.zeros((batch_size, seq_len))])

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 21 (6 by maintainers)

Most upvoted comments

I have the same problem on a T4. I found that it happens because my node name has an underscore “_” at the beginning. In the function bool IsValidNodeName(StringPiece s, bool allow_internal_ops), the name is rejected with this error when allow_internal_ops is false. In my case, allow_internal_ops comes from tensorflow::GraphConstructorOptions, which is not set when ConvertGraphDefToGraph is called. I changed the opts to allow internal ops, and then the warning disappeared.
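
To illustrate the rule described in this comment, here is a minimal Python sketch of a node-name check. It is a simplified approximation, not TensorFlow's actual implementation: the function name is_valid_node_name and both regular expressions are assumptions made for illustration. The only behavior taken from the comment is that a leading underscore is rejected unless internal ops are allowed.

```python
import re

# Assumed, simplified character rules (not copied from TensorFlow):
# the first character is a letter, digit, or dot; later characters may
# also be underscore, dash, or slash.
_PUBLIC_NAME = re.compile(r"[A-Za-z0-9.][A-Za-z0-9_./-]*")
# Internal ops may additionally start with an underscore.
_INTERNAL_NAME = re.compile(r"[A-Za-z0-9._][A-Za-z0-9_./-]*")


def is_valid_node_name(name, allow_internal_ops=False):
    """Hypothetical helper approximating the check described above."""
    pattern = _INTERNAL_NAME if allow_internal_ops else _PUBLIC_NAME
    return pattern.fullmatch(name) is not None


node = "_arg_segments_ids_input_0_1_0_arg"
print(is_valid_node_name(node, allow_internal_ops=False))
print(is_valid_node_name(node, allow_internal_ops=True))
print(is_valid_node_name("segments_ids_input", allow_internal_ops=False))
```

Under these assumptions, the node name from the warning fails the check only when internal ops are disallowed, which matches the observation that allowing internal ops makes the warning disappear. The user-facing input name segments_ids_input passes either way, so the leading underscore is the only problem in this sketch.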