federated: Invalid argument: Default MaxPoolingOp only supports NHWC on device type CPU

I’ve run into an error when trying to run a CNN model with a max-pooling operation in my TFF experiment.

I have two Nvidia GPUs on my machine, and TFF is able to use both of them when I run a DNN model without max-pooling.

As you can see in the log below, my script loads both GPUs, yet I get an error saying the max-pooling op is running on the CPU.

2020-04-15 00:12:16.118122: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-04-15 00:12:16.118155: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-04-15 00:12:16.118206: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-04-15 00:12:16.118225: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-04-15 00:12:16.118254: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-04-15 00:12:16.118284: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-04-15 00:12:16.118303: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-04-15 00:12:16.119743: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0, 1
2020-04-15 00:12:16.119789: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-04-15 00:12:16.119811: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0 1 
2020-04-15 00:12:16.119817: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N N 
2020-04-15 00:12:16.119823: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 1:   N N 
2020-04-15 00:12:16.120866: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5707 MB memory) -> physical GPU (device: 0, name: GeForce GTX TITAN Black, pci bus id: 0000:04:00.0, compute capability: 3.5)
2020-04-15 00:12:16.121361: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 5707 MB memory) -> physical GPU (device: 1, name: GeForce GTX TITAN Black, pci bus id: 0000:83:00.0, compute capability: 3.5)
2020-04-15 00:12:18.113550: E tensorflow/core/common_runtime/executor.cc:654] Executor failed to create kernel. Invalid argument: Default MaxPoolingOp only supports NHWC on device type CPU
	 [[{{node StatefulPartitionedCall/StatefulPartitionedCall/sequential/max_pooling1d/MaxPool}}]]
2020-04-15 00:12:18.120562: W tensorflow/core/framework/op_kernel.cc:1655] OP_REQUIRES failed at iterator_ops.cc:611 : Invalid argument: Default MaxPoolingOp only supports NHWC on device type CPU
	 [[{{node StatefulPartitionedCall/StatefulPartitionedCall/sequential/max_pooling1d/MaxPool}}]]
Traceback (most recent call last):
  File "/home/getalp/eks/env/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1367, in _do_call
    return fn(*args)
  File "/home/getalp/eks/env/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1352, in _run_fn
    target_list, run_metadata)
  File "/home/getalp/eks/env/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1445, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: Default MaxPoolingOp only supports NHWC on device type CPU
	 [[{{node StatefulPartitionedCall/StatefulPartitionedCall/sequential/max_pooling1d/MaxPool}}]]
	 [[subcomputation/StatefulPartitionedCall_1/ReduceDataset]]
	 [[subcomputation/StatefulPartitionedCall_1/ReduceDataset/_42]]
  (1) Invalid argument: Default MaxPoolingOp only supports NHWC on device type CPU
	 [[{{node StatefulPartitionedCall/StatefulPartitionedCall/sequential/max_pooling1d/MaxPool}}]]
	 [[subcomputation/StatefulPartitionedCall_1/ReduceDataset]]
0 successful operations.
0 derived errors ignored.

The model I used is:

import tensorflow as tf

def create_keras_model():
    # segment_size, num_input_channels, dropout_rate, and activityCount are defined elsewhere in the script.
    return tf.keras.models.Sequential([
        tf.keras.layers.Input(shape=(segment_size, num_input_channels)),
        tf.keras.layers.Conv1D(196, 16, activation='relu'),
        tf.keras.layers.MaxPool1D(4),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(units=1024, activation='relu'),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(activityCount, activation='softmax'),
    ])
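
One way to see where the pooling op actually lands, outside of TFF, is to enable device-placement logging and run a single forward pass of the same layer stack. This is only a diagnostic sketch; the shape, class-count, and dropout values below are hypothetical stand-ins for the ones defined in my script.

import numpy as np
import tensorflow as tf

# Log which device (GPU vs. CPU) each op is placed on.
tf.debugging.set_log_device_placement(True)

# Hypothetical stand-ins for the script's own values, just for a standalone check.
segment_size, num_input_channels, activityCount, dropout_rate = 128, 3, 6, 0.5

model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(segment_size, num_input_channels)),
    tf.keras.layers.Conv1D(196, 16, activation='relu'),
    tf.keras.layers.MaxPool1D(4),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(units=1024, activation='relu'),
    tf.keras.layers.Dropout(dropout_rate),
    tf.keras.layers.Dense(activityCount, activation='softmax'),
])

# A single forward pass is enough to trigger placement of the MaxPool op.
model(np.random.rand(1, segment_size, num_input_channels).astype(np.float32))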

Environment:

  • OS Platform and Distribution: “Debian GNU/Linux 10 (buster)”
  • Python package versions (e.g., TensorFlow Federated, TensorFlow): TensorFlow 2.1.1, TFF 0.13.1
  • Python version: 3.7
  • Bazel version (if building from source):
  • CUDA/cuDNN version: CUDA 10.1 / cuDNN 7.6.5
  • What TensorFlow Federated execution stack are you using?: Default

Expected behavior: The computation should execute normally, without the max-pooling op error.

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 28 (4 by maintainers)

Most upvoted comments

@Sannaraek For the warning, I would suggest you check your input data and cast it to float32; float32 is better supported in general. We will also double-check our support for float64 in TFF.
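
A minimal sketch of that suggestion, assuming the client data is a tf.data.Dataset of (features, label) pairs (the array shapes below are illustrative, not taken from the thread):

import numpy as np
import tensorflow as tf

# Illustrative stand-in for a client's data: float64 features, integer labels.
features = np.random.rand(32, 128, 3)            # NumPy defaults to float64
labels = np.random.randint(0, 6, size=(32,))

client_dataset = tf.data.Dataset.from_tensor_slices((features, labels))

# Cast only the features to float32; labels usually stay integer-typed.
client_dataset = client_dataset.map(
    lambda x, y: (tf.cast(x, tf.float32), y))

print(client_dataset.element_spec)  # features should now be tf.float32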

@Sannaraek I cannot run your Colab because of the Drive access. This line looks suspicious to me: tf.keras.backend.set_floatx('float64'). Is it necessary for your experiment? Could you try commenting it out?
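
A possible reason that line matters (not confirmed in the thread): set_floatx('float64') makes every subsequently created Keras layer default to float64 weights and activations, and float64 kernels generally have narrower GPU support, which can push ops back onto the CPU. A quick check of the dtype effect:

import tensorflow as tf

# With the Keras float default set to float64, new layers are created in float64.
tf.keras.backend.set_floatx('float64')
print(tf.keras.layers.Dense(4).dtype)   # float64

# Reverting to the stock default keeps layers in float32, which is broadly
# supported by GPU kernels.
tf.keras.backend.set_floatx('float32')
print(tf.keras.layers.Dense(4).dtype)   # float32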