tensorflow: tf.contrib.image.transform crashes under Windows when CUDA is enabled
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes. A minimal example reproducing the bug is provided below.
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10 (x64)
- TensorFlow installed from (source or binary): binary, installed with pip install tensorflow-gpu
- TensorFlow version (use command below): b'unknown' 1.6.0 (also tested on b'unknown' 1.4.0)
- Python version: 3.6
- Bazel version (if compiling from source): N/A
- GCC/Compiler version (if compiling from source): N/A
- CUDA/cuDNN version: CUDA 9.0.176 / cuDNN 7.0.5 and CUDA 8.0.61 / cuDNN 6.14.11
- GPU model and memory: GeForce GTX 970M (pci bus id: 0000:01:00.0, compute capability: 5.2, 3.00GiB) and Quadro M1200 (pci bus id: 0000:01:00.0, compute capability: 5.0, 4.00GiB)
- Exact command to reproduce: python example.py (see below)
Describe the problem
The function tf.contrib.image.transform crashes when CUDA is enabled under Windows. It fails with CUDA_ERROR_ILLEGAL_INSTRUCTION on tensorflow 1.6 and with CUDA_ERROR_LAUNCH_FAILED on tensorflow 1.4.
However, it functions correctly when CUDA is disabled (by setting CUDA_VISIBLE_DEVICES to -1).
I tested a variety of parameters (different batch sizes, image sizes, and numbers of channels), but the behavior stays the same. In addition, I reproduced the same error on a different machine with an older tensorflow version.
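One way to reproduce the CPU-only check mentioned above (setting the variable from Python rather than from the shell is my assumption):
import os

# Hide all CUDA devices so TensorFlow falls back to the CPU.
# This has to happen before tensorflow is imported.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

import tensorflow as tf  # all ops are now placed on the CPU

# Running the repro script below in this state completes without error.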
Source code / logs
- Code:
import numpy as np
import tensorflow as tf

batch_size, image_size, channels = 1, 32, 1

data = np.zeros(
    shape=(batch_size, image_size, image_size, channels),
    dtype=np.float32)
data_node = tf.placeholder(
    shape=(batch_size, image_size, image_size, channels),
    dtype=tf.float32)

identity = tf.constant([1, 0, 0, 0, 1, 0, 0, 0], dtype=tf.float32)
transform = tf.tile(tf.expand_dims(identity, 0), [batch_size, 1])

data_node_transformed = tf.contrib.image.transform(data_node, transform)
data_t = tf.Session().run([data_node_transformed], feed_dict={data_node: data})
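A possible interim workaround, not verified in this report (the report only confirms that hiding the GPU entirely avoids the crash), would be to pin just the transform op to the CPU while leaving the rest of the graph on the GPU. A sketch, assuming such partial placement is enough to sidestep the failing kernel launch:
import numpy as np
import tensorflow as tf

batch_size, image_size, channels = 1, 32, 1
data = np.zeros((batch_size, image_size, image_size, channels), dtype=np.float32)
data_node = tf.placeholder(shape=data.shape, dtype=tf.float32)

identity = tf.constant([1, 0, 0, 0, 1, 0, 0, 0], dtype=tf.float32)
transform = tf.tile(tf.expand_dims(identity, 0), [batch_size, 1])

# Place only the image transform on the CPU; other ops may still use the GPU.
# Whether this avoids the crash is an assumption, not something tested above.
with tf.device('/cpu:0'):
    data_node_transformed = tf.contrib.image.transform(data_node, transform)

data_t = tf.Session().run([data_node_transformed], feed_dict={data_node: data})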
- Console Output:
Output with tensorflow 1.6 (with GeForce GTX 970M):
2018-03-06 15:42:21.578078: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1212] Found device 0 with properties:
name: GeForce GTX 970M major: 5 minor: 2 memoryClockRate(GHz): 1.038
pciBusID: 0000:01:00.0
totalMemory: 3.00GiB freeMemory: 2.48GiB
2018-03-06 15:42:21.578310: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1312] Adding visible gpu devices: 0
2018-03-06 15:42:21.890290: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2192 MB memory) -> physical GPU (device: 0, name: GeForce GTX 970M, pci bus id: 0000:01:00.0, compute capability: 5.2)
2018-03-06 15:42:22.101031: E C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_INSTRUCTION
2018-03-06 15:42:22.101032: E C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_driver.cc:1110] could not synchronize on CUDA context: CUDA_ERROR_ILLEGAL_INSTRUCTION ::
2018-03-06 15:42:22.101185: F C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_event_mgr.cc:203] Unexpected Event status: 1
Output with tensorflow 1.4 (with Quadro M1200):
2018-03-06 15:31:17.800588: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1030] Found device 0 with properties:
name: Quadro M1200 major: 5 minor: 0 memoryClockRate(GHz): 1.148
pciBusID: 0000:01:00.0
totalMemory: 4.00GiB freeMemory: 3.35GiB
2018-03-06 15:31:17.803815: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Quadro M1200, pci bus id: 0000:01:00.0, compute capability: 5.0)
2018-03-06 15:31:18.731252: E C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_driver.cc:1110] could not synchronize on CUDA context: CUDA_ERROR_LAUNCH_FAILED :: No stack trace available
2018-03-06 15:31:18.731258: E C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_LAUNCH_FAILED
2018-03-06 15:31:18.733458: F C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_event_mgr.cc:203] Unexpected Event status: 1
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Reactions: 9
- Comments: 26 (9 by maintainers)
@Aliyss I recreated a fresh conda env with cuDNN, tensorflow-gpu 1.8, and keras-gpu 2.1 installed from Anaconda. Before that I downgraded my CUDA from 9.2 to 9.0. That solved the issue for me.
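For anyone checking whether a recreated environment picked up the intended builds, a small sketch (my addition, not part of the comment above; the expected values depend on what was actually installed):
import tensorflow as tf
from tensorflow.python.client import device_lib

print(tf.VERSION)                  # expect 1.8.0 after the reinstall
print(tf.test.is_gpu_available())  # expect True with CUDA 9.0 / cuDNN 7
print([d.name for d in device_lib.list_local_devices()])

# If the GPU shows up here, re-run the repro script from the issue
# (python example.py) to confirm the transform no longer crashes.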