tensorflow: Depthwise convolution inside Dataset API throws data_format error

System information

Linux Ubuntu 16.04
TensorFlow installed using: pip
TensorFlow version: 1.10
Python version: 3.6
CUDA/cuDNN version: 9.0/7.2
GPU model and memory: GeForce GTX 1080 ti

I tried to reproduce the error with a simpler script but couldn’t manage it. Basically, I use the Dataset API to load patches from TFRecords and apply a synthetic blur to them (the blur kernels are loaded from another set of TFRecords), similar to this:

import tensorflow as tf
import numpy as np

def apply_blur(img):
    blur = np.random.rand(3,3,1,1)
    img = tf.nn.depthwise_conv2d(img[None], blur, [1,1,1,1], 'VALID')
    return img

tf.reset_default_graph()
dataset = tf.data.Dataset.from_tensor_slices(np.ones((10, 128, 128, 1)))
dataset = dataset.map(apply_blur, 2)

iterator = dataset.make_one_shot_iterator()
batch = iterator.get_next()

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
with tf.Session(config=config) as sess:
    out = sess.run(batch)

My code still works with TF 1.8, but not with later versions. The error says the data_format must be “NHWC” for depthwise convolution on CPU, while in fact my code already uses “NHWC” (the default for tf.nn.depthwise_conv2d).

Source code / logs

Traceback (most recent call last):
  File "cli_deblurring.py", line 95, in <module>
    cli()
  File "/home/user/miniconda3/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/user/miniconda3/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/user/miniconda3/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/user/miniconda3/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "cli_deblurring.py", line 90, in cli
    model.train()
  File "/home/user/Project/model/__init__.py", line 159, in train
    self._train(sess)
  File "/home/user/Project/model/deblurring.py", line 318, in _train
    epoch, global_step = self._train_epoch(sess)
  File "/home/user/Project/model/deblurring.py", line 291, in _train_epoch
    sess.run(light_fetches, feed_dict=self.train_feed_dict)
  File "/home/user/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/home/user/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/user/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
    run_metadata)
  File "/home/user/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnimplementedError: Depthwise convolution on CPU is only supported for NHWC format
         [[Node: depthwise_3 = DepthwiseConv2dNative[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1]](depthwise-0-TransposeNHWCToNCHW-LayoutOptimizer, strided_slice_4)]]
         [[Node: load_data/IteratorGetNext = IteratorGetNext[output_shapes=[[?,8,128,128,1], [?,128,128,1]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](load_data/Iterator)]]
         [[Node: load_data/IteratorGetNext/_671 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_64_load_data/IteratorGetNext", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

Any idea why this error could happen in higher versions?

About this issue

State: closed
Created 6 years ago
Reactions: 2
Comments: 21 (8 by maintainers)

Most upvoted comments

I’ve also just run across this (TensorFlow 1.14.0). It’s a bit frustrating because this is a shared function in a large code base, and in other call sites it does indeed run on the GPU. Guarding it with tf.device('/cpu:0') therefore isn’t very scalable, given that sometimes I do want it to run on the GPU.

patricksnape on Jun 26, 2019

@mrry You’re correct. Disabling the layout optimizer did the trick. Based on your second reply, I also tried using with tf.device('/cpu:0'): inside apply_blur(...) and it worked! Full code:

import tensorflow as tf

def apply_blur(img):
    with tf.device('/cpu:0'):
        blur = tf.random_uniform((3,3,1,1))
        img = tf.nn.depthwise_conv2d(img[None], blur, [1,1,1,1], 'VALID')[0]
        return img

tf.reset_default_graph()
dataset = tf.data.Dataset.from_tensor_slices(tf.ones((10, 128, 128, 1)))
dataset = dataset.map(apply_blur, 2)

iterator = dataset.make_one_shot_iterator()
batch = iterator.get_next()

config = tf.ConfigProto()
# this also works
# from tensorflow.core.protobuf import rewriter_config_pb2
# config.graph_options.rewrite_options.layout_optimizer = rewriter_config_pb2.RewriterConfig.OFF
config.gpu_options.allow_growth = True
with tf.Session(config=config) as sess:
    out = sess.run(batch)

If you wrap only the tf.nn.depthwise_conv2d(...) call in with tf.device('/cpu:0'):, it works too:

def apply_blur(img):
    blur = tf.random_uniform((3,3,1,1))
    with tf.device('/cpu:0'):
        img = tf.nn.depthwise_conv2d(img[None], blur, [1,1,1,1], 'VALID')
    return img[0]

rodrigoberriel on Oct 25, 2018
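
For context on why both workarounds help: the failing node in the log takes its input from depthwise-0-TransposeNHWCToNCHW-LayoutOptimizer and reports data_format="NCHW". Together with the fact that disabling the layout optimizer avoids the error, this suggests the graph was rewritten from NHWC to NCHW even though the op ends up on the CPU, where the error says only NHWC is supported. The snippet below is only a NumPy sketch of what such a layout conversion does to a tensor. It is not TensorFlow's implementation, and the array shape and variable names are made up for illustration.

```python
import numpy as np

# Hypothetical tensor in NHWC layout: (batch, height, width, channels).
nhwc = np.arange(1 * 2 * 3 * 4, dtype=np.float32).reshape(1, 2, 3, 4)

# NHWC -> NCHW: move the channel axis in front of height and width.
nchw = np.transpose(nhwc, (0, 3, 1, 2))

# NCHW -> NHWC: the inverse permutation restores the original layout.
restored = np.transpose(nchw, (0, 2, 3, 1))

print(nhwc.shape)  # (1, 2, 3, 4)
print(nchw.shape)  # (1, 4, 2, 3)
```

The values are identical in both layouts and only the axis order differs, so an op that accepts only NHWC (as the CPU depthwise kernel does, according to the error message) cannot consume the rewritten input.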

@yongtang You ran the code with the workaround applied. The version that throws the error is the one in the OP’s message, followed by (AFAIK) any operation after tf.nn.depthwise_conv2d. Also, I just noticed you said tf-nightly instead of tf-nightly-gpu; if that is not a typo, you’re running CPU-only TensorFlow. This problem only occurs when using TensorFlow for GPU (tensorflow-gpu and tf-nightly-gpu). I just tested on tf-nightly-gpu (version b'v1.12.0-rc0-963-gbcfcb4d765' 1.13.0-dev20181016) and the error persists. Full code, just in case:

import tensorflow as tf
import numpy as np

def apply_blur(img):
    blur = np.random.rand(3,3,1,1)
    img = tf.nn.depthwise_conv2d(img[None], blur, [1,1,1,1], 'VALID')[0]
    return img

tf.reset_default_graph()
dataset = tf.data.Dataset.from_tensor_slices(np.ones((10, 128, 128, 1)))
dataset = dataset.map(apply_blur, 2)

iterator = dataset.make_one_shot_iterator()
batch = iterator.get_next()

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
with tf.Session(config=config) as sess:
    out = sess.run(batch)

rodrigoberriel on Oct 16, 2018