tensorflow: Standalone graph trained with NCHW data_format gives errors on CPU when testing (running forward pass)

System information

Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 14.04
TensorFlow installed from (source or binary): Binary (.whl file)
TensorFlow version (use command below): 1.2.0-rc1
Python version: 2.7
Bazel version (if compiling from source): N/A
CUDA/cuDNN version: CUDA 8.0.61 CuDNN 5.1.10
GPU model and memory: GeForce GTX 1080
Exact command to reproduce:

Describe the problem

I have a standalone graph created by freezing my model/net, which has a few slim.conv2d and slim.max_pool2d operations defined with the NCHW data format. I used the freeze_graph.py utility to create the standalone graph from a trained checkpoint (trained using NCHW format). When I tested the graph (ran a forward pass) on an Ubuntu machine with a GPU, it ran fine. But when I ran the same test code (forward pass) with the same standalone graph on an Ubuntu machine that has only a CPU and no GPU, I got the following errors:

2017-09-18 17:57:48.890761: E tensorflow/core/common_runtime/executor.cc:644] Executor failed to create kernel. Invalid argument: Default MaxPoolingOp only supports NHWC.
         [[Node: text_box_300/pool1/MaxPool = MaxPool[T=DT_FLOAT, data_format="NCHW", ksize=[1, 1, 2, 2], padding="SAME", strides=[1, 1, 2, 2], _device="/job:localhost/replica:0/task:0/cpu:0"](text_box_300/conv1/conv1_2/Relu)]]
2017-09-18 17:57:48.892768: E main_textd_test.cc:457] Running model failed: Invalid argument: Default MaxPoolingOp only supports NHWC.
         [[Node: text_box_300/pool1/MaxPool = MaxPool[T=DT_FLOAT, data_format="NCHW", ksize=[1, 1, 2, 2], padding="SAME", strides=[1, 1, 2, 2], _device="/job:localhost/replica:0/task:0/cpu:0"](text_box_300/conv1/conv1_2/Relu)]]

I believe the errors are related to https://github.com/tensorflow/tensorflow/issues/2660. Is it possible to modify the standalone graph that runs well on GPU so that it also runs on a CPU-only machine? Is it possible to avoid retraining the model in NHWC format and recreating the standalone model?
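
On the graph-rewrite question: the failing node carries its layout in three places, the data_format attribute plus the ksize and strides lists, and all three would have to change together for the pooling to mean the same thing in NHWC. The snippet below is only a sketch of that attribute permutation, written in plain Python on a dict that mirrors the node from the log. `permute` and `maxpool_attrs_to_nhwc` are hypothetical helpers, not TensorFlow APIs, and nothing here touches an actual GraphDef.

```python
# Axis permutations between the two layouts.
# NCHW = (batch, channels, height, width); NHWC = (batch, height, width, channels).
NCHW_TO_NHWC = (0, 2, 3, 1)
NHWC_TO_NCHW = (0, 3, 1, 2)


def permute(values, perm):
    """Reorder a per-dimension list (shape, ksize, strides) by `perm`."""
    if len(values) != len(perm):
        raise ValueError("expected %d entries, got %d" % (len(perm), len(values)))
    return [values[i] for i in perm]


def maxpool_attrs_to_nhwc(attrs):
    """Return a copy of MaxPool-style attributes expressed in NHWC.

    `attrs` is a plain dict used for illustration (hypothetical structure),
    not a real NodeDef.
    """
    converted = dict(attrs)
    if attrs["data_format"] == "NCHW":
        converted["data_format"] = "NHWC"
        converted["ksize"] = permute(attrs["ksize"], NCHW_TO_NHWC)
        converted["strides"] = permute(attrs["strides"], NCHW_TO_NHWC)
    return converted


# Attributes copied from the failing node in the error log.
node = {
    "data_format": "NCHW",
    "ksize": [1, 1, 2, 2],
    "strides": [1, 1, 2, 2],
    "padding": "SAME",
}
converted = maxpool_attrs_to_nhwc(node)
print(converted["data_format"], converted["ksize"], converted["strides"])
```

A real rewrite would also need to transpose the activations entering and leaving the converted region with the same permutations, and anything that depends on element order after the convolutional stack (for example a flatten followed by a fully connected layer) would see a different ordering. Convolution filters are stored as [height, width, in_channels, out_channels] under either data_format, so those weights should not need to change.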

About this issue

State: closed
Created 7 years ago
Comments: 23 (3 by maintainers)

Most upvoted comments

Hi @aselle ,

I agree that it is a major issue if there is no way to rewrite graphs and checkpoints from NCHW to NHWC and from NHWC to NCHW. The official TensorFlow (v1.2) documentation recommended training with NCHW and doing inference with NHWC:

The best practice is to build models that work with both NCHW and NHWC as it is common to train using NCHW on GPU, and then do inference with NHWC on CPU.

tumusudheer on Sep 26, 2017
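
The recommendation quoted above amounts to treating the layout as a parameter rather than hard-coding it. Below is a minimal sketch of that idea, assuming the model builder takes a data_format string and forwards it to each layer (slim.conv2d and slim.max_pool2d both accept a data_format argument). The helper names are illustrative only, and the 300x300x3 input size is made up for the example.

```python
def pick_data_format(has_gpu):
    """Choose NCHW when a GPU is available, NHWC otherwise."""
    return "NCHW" if has_gpu else "NHWC"


def input_shape(batch, height, width, channels, data_format):
    """Return the 4-D input shape for the chosen layout."""
    if data_format == "NHWC":
        return [batch, height, width, channels]
    if data_format == "NCHW":
        return [batch, channels, height, width]
    raise ValueError("unsupported data_format: %r" % (data_format,))


def channel_axis(data_format):
    """Index of the channel dimension, e.g. for concatenation or bias add."""
    if data_format not in ("NCHW", "NHWC"):
        raise ValueError("unsupported data_format: %r" % (data_format,))
    return 1 if data_format == "NCHW" else 3


# The same model code would be built once per target, with only this string changing.
for has_gpu in (True, False):
    fmt = pick_data_format(has_gpu)
    print(fmt, input_shape(1, 300, 300, 3, fmt), channel_axis(fmt))
```

Because the variables of these layers do not depend on data_format, a checkpoint trained with the NCHW build can in principle be restored into the NHWC build of the same model code and frozen again for CPU inference, provided nothing downstream depends on the flattened element order.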