tensorflow: A check fail can be triggered in LSTMBlockCell
Issue Type
Bug
Source
binary
Tensorflow Version
tf 2.9 and 2.12.0-dev20221018
Custom Code
No
OS Platform and Distribution
Linux Ubuntu 20.04
Mobile device
No response
Python version
3.8
Bazel version
No response
GCC/Compiler version
No response
CUDA/cuDNN version
CUDA 11.5
GPU model and memory
No response
Current Behaviour?
A crash due to a check failure can be triggered.
Standalone code to reproduce the issue
import os
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'
import tensorflow as tf
import numpy as np
print(tf.__version__)
for _ in range(20):
    try:
        forget_bias = 112.66590343649887
        cell_clip = 67.12389445926587
        use_peephole = False
        x = tf.saturate_cast(tf.random.uniform([2, 16], minval=0, maxval=64, dtype=tf.int64), dtype=tf.half)
        cs_prev = tf.saturate_cast(tf.random.uniform([2, 0], minval=0, maxval=64, dtype=tf.int64), dtype=tf.half)
        h_prev = tf.saturate_cast(tf.random.uniform([2, 0], minval=0, maxval=64, dtype=tf.int64), dtype=tf.half)
        w = tf.saturate_cast(tf.random.uniform([16, 0], minval=0, maxval=64, dtype=tf.int64), dtype=tf.half)
        wci = tf.saturate_cast(tf.random.uniform([5], minval=0, maxval=64, dtype=tf.int64), dtype=tf.half)
        wcf = tf.saturate_cast(tf.random.uniform([16], minval=0, maxval=64, dtype=tf.int64), dtype=tf.half)
        wco = tf.saturate_cast(tf.random.uniform([13], minval=0, maxval=64, dtype=tf.int64), dtype=tf.half)
        b = tf.saturate_cast(tf.random.uniform([0], minval=0, maxval=64, dtype=tf.int64), dtype=tf.half)
        res = tf.raw_ops.LSTMBlockCell(
            forget_bias=forget_bias,
            cell_clip=cell_clip,
            use_peephole=use_peephole,
            x=x,
            cs_prev=cs_prev,
            h_prev=h_prev,
            w=w,
            wci=wci,
            wcf=wcf,
            wco=wco,
            b=b,
        )
    except:
        pass
Relevant log output
F tensorflow/core/kernels/rnn/lstm_ops_gpu.cu.cc:277] Non-OK-status: GpuLaunchKernel( lstm_gates<T, false, gate_layout>, grid_dim_2d, block_dim_2d, 0, cu_stream, gates.data(), b.data(), cs_prev.data(), wci.data(), wcf.data(), wco.data(), o.data(), h.data(), ci.data(), cs.data(), co.data(), i.data(), f.data(), forget_bias, cell_clip, batch_size, cell_size) status: INTERNAL: invalid configuration argument
Aborted (core dumped)
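"invalid configuration argument" is the status CUDA reports when a kernel launch is given an unusable configuration, for example a grid or block dimension of zero. The sketch below shows one plausible way the zero-width cs_prev ([2, 0], so cell_size = 0) could produce such a launch, assuming the grid is sized by ceiling division over batch_size and cell_size. The function names and the block sizes (8 and 32) are illustrative assumptions, not values taken from lstm_ops_gpu.cu.cc.

```python
def launch_grid(batch_size, cell_size, block_x=8, block_y=32):
    """Return (grid_x, grid_y) for a 2-D launch sized by ceiling division.

    block_x and block_y are illustrative assumptions, not TensorFlow's values.
    """
    grid_x = (batch_size + block_x - 1) // block_x
    grid_y = (cell_size + block_y - 1) // block_y
    return grid_x, grid_y


def is_valid_launch(grid):
    # CUDA requires every grid dimension to be at least 1.
    return all(dim >= 1 for dim in grid)


# Shapes from the report: cs_prev is [2, 0], so batch_size=2 and cell_size=0.
grid = launch_grid(batch_size=2, cell_size=0)
print(grid, is_valid_launch(grid))
```

Under these assumptions the second grid dimension comes out as zero, which matches the kind of configuration the runtime rejects.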
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 36 (17 by maintainers)
Folks, the way TF (and most OSS projects) works is that whoever sends the PR that does the job gets the work. Especially for big projects, it is very hard to keep assigning contributors to issues and then unassigning them when they turn inactive.
E.g. check https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/sparse_to_dense_op.cc
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/sparse_to_dense_op_gpu.cu.cc#L121
import os
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'
import tensorflow as tf
import numpy as np

# Set random values for inputs
forget_bias = 112.66590343649887
cell_clip = 67.12389445926587
use_peephole = False
x = tf.saturate_cast(tf.random.uniform([2, 16], minval=0, maxval=64, dtype=tf.int64), dtype=tf.half)
cs_prev = tf.saturate_cast(tf.random.uniform([2, 0], minval=0, maxval=64, dtype=tf.int64), dtype=tf.half)
h_prev = tf.saturate_cast(tf.random.uniform([2, 0], minval=0, maxval=64, dtype=tf.int64), dtype=tf.half)
w = tf.saturate_cast(tf.random.uniform([16, 0], minval=0, maxval=64, dtype=tf.int64), dtype=tf.half)
wci = tf.saturate_cast(tf.random.uniform([5], minval=0, maxval=64, dtype=tf.int64), dtype=tf.half)
wcf = tf.saturate_cast(tf.random.uniform([16], minval=0, maxval=64, dtype=tf.int64), dtype=tf.half)
wco = tf.saturate_cast(tf.random.uniform([13], minval=0, maxval=64, dtype=tf.int64), dtype=tf.half)
b = tf.saturate_cast(tf.random.uniform([0], minval=0, maxval=64, dtype=tf.int64), dtype=tf.half)

# Run LSTMBlockCell operation in a loop for 20 iterations
for _ in range(20):
    try:
        res = tf.raw_ops.LSTMBlockCell(
            forget_bias=forget_bias,
            cell_clip=cell_clip,
            use_peephole=use_peephole,
            x=x,
            cs_prev=cs_prev,
            h_prev=h_prev,
            w=w,
            wci=wci,
            wcf=wcf,
            wco=wco,
            b=b,
        )
    except:
        pass
This will be the corrected code for this issue.
So probably in this case it is OP_REQUIRES_OK.
While these are internal APIs, the check failure could lead to denial of service. In the past this used to be considered a vulnerability, but since the impact is bounded and preventable, the policy now is for these to be filed as issues and resolved whenever someone from the community wants to pick them up (they’re very good first issues). Please don’t auto-close these!
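Until the kernel rejects these inputs itself, a caller can guard against the abort by checking shapes before invoking the op. The sketch below is a pure-Python illustration of that kind of validation, not the actual fix, which would belong in the C++ kernel. The helper name check_lstm_block_cell_shapes is hypothetical, and the shape relations it encodes (w is [input_size + cell_size, 4 * cell_size], the peephole weights are [cell_size], b is [4 * cell_size]) are my reading of the op's documented contract and should be treated as assumptions.

```python
def check_lstm_block_cell_shapes(x, cs_prev, h_prev, w, wci, wcf, wco, b):
    """Return a list of shape problems; an empty list means no issue was found.

    Every argument is a shape given as a tuple of ints. The expected
    relations are assumptions based on the op's documented contract.
    """
    if len(x) != 2 or len(cs_prev) != 2:
        return ["x and cs_prev must both be rank 2"]

    batch_size, input_size = x
    cell_size = cs_prev[1]
    problems = []

    if cell_size <= 0:
        problems.append("cell_size must be positive")
    if cs_prev[0] != batch_size:
        problems.append("cs_prev batch dimension must match x")

    # Expected shape for each remaining input, derived from x and cs_prev.
    expected = {
        "h_prev": (batch_size, cell_size),
        "w": (input_size + cell_size, 4 * cell_size),
        "wci": (cell_size,),
        "wcf": (cell_size,),
        "wco": (cell_size,),
        "b": (4 * cell_size,),
    }
    actual = {"h_prev": h_prev, "w": w, "wci": wci,
              "wcf": wcf, "wco": wco, "b": b}
    for name, want in expected.items():
        if tuple(actual[name]) != want:
            problems.append(f"{name} has shape {tuple(actual[name])}, expected {want}")
    return problems


# Shapes from the reproduction script in this issue.
for problem in check_lstm_block_cell_shapes(
    x=(2, 16), cs_prev=(2, 0), h_prev=(2, 0), w=(16, 0),
    wci=(5,), wcf=(16,), wco=(13,), b=(0,),
):
    print(problem)
```

A caller would run this on the tensors' static shapes and skip the tf.raw_ops.LSTMBlockCell call whenever the returned list is non-empty.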