apex: SyncBatchNorm doesn't support 2-dimensional input?
Hi, I'm facing an issue where the program crashes when the input to SyncBatchNorm is two-dimensional. Here's the code:
import torch
import apex
model = apex.parallel.SyncBatchNorm(4).cuda()
data = torch.rand((8,4)).cuda()
output = model(data)
When running the code, an error is raised like this:
Traceback (most recent call last):
File "syncbn_test.by", line 7, in <module>
output = model(data)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/apex/parallel/optimized_sync_batchnorm.py", line 81, in forward
return SyncBatchnormFunction.apply(input, self.weight, self.bias, self.running_mean, self.running_var, self.eps, self.training or not self.track_running_stats, exponential_average_factor, self.process_group, self.channel_last)
File "/usr/local/lib/python3.5/dist-packages/apex/parallel/optimized_sync_batchnorm_kernel.py", line 27, in forward
mean, var_biased = syncbn.welford_mean_var(input)
RuntimeError: Dimension out of range (expected to be in range of [-2, 1], but got 2) (maybe_wrap_dim at /pytorch/aten/src/ATen/core/WrapDimMinimal.h:18)
And everything runs OK when data is a 4-dimensional tensor.
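Until 2D input is supported, a possible workaround (a minimal sketch building only on the observation above that 4D input works) is to add two dummy spatial dimensions before the layer and drop them afterwards:

import torch
import apex

model = apex.parallel.SyncBatchNorm(4).cuda()
data = torch.rand((8, 4)).cuda()
# Wrap the 2D batch as (N, C, 1, 1) so the supported 4D path is used,
# then drop the dummy spatial dimensions again; the per-channel statistics
# are unchanged because H = W = 1.
output = model(data.unsqueeze(-1).unsqueeze(-1)).squeeze(-1).squeeze(-1)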
Here is my environment:
Ubuntu 16.04
Python 3.5.2
PyTorch 1.0.1, installed with "pip install torch"
apex installed with the command:
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" .
CUDA 10.0
NVIDIA driver 410.72
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 19 (5 by maintainers)
Commits related to this issue
- [SyncBatchNorm] supporting 2 dimensional input, resolving issue #194 Implementation: for 2d input, switching channel_last flag to true for better memory access pattern in the kernel. — committed to NVIDIA/apex by jjsjann123 5 years ago
- [SyncBatchNorm] (#206) supporting 2 dimensional input, resolving issue #194 Implementation: for 2d input, switching channel_last flag to true for better memory access pattern in the kernel. — committed to NVIDIA/apex by jjsjann123 5 years ago
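For context on those commits: in a plain 2D batch of shape (N, C) the channel is already the innermost dimension, which is the same layout the channel_last kernel assumes for flattened (N*H*W, C) data, so switching the channel_last flag to true lets the existing per-channel Welford reduction handle 2D input. As a rough illustration of the statistics involved (plain PyTorch ops, not apex's fused syncbn.welford_mean_var kernel):

import torch

# Per-channel biased mean and variance over a 2D (N, C) batch; this is the
# kind of reduction the fused welford_mean_var kernel performs per channel.
x = torch.rand(8, 4)
mean = x.mean(dim=0)                       # shape (4,)
var_biased = x.var(dim=0, unbiased=False)  # shape (4,), biased estimator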
Hello, I'm running into the same problem. apex was installed on 16 Sep.
import torch
import apex
model = apex.parallel.SyncBatchNorm(4).cuda()
data = torch.rand((8,4)).cuda()
output = model(data)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/gpfs01/user_home/zhuying/.conda/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/gpfs01/user_home/zhuying/.conda/envs/py36/lib/python3.6/site-packages/apex/parallel/optimized_sync_batchnorm.py", line 85, in forward
return SyncBatchnormFunction.apply(input, z, self.weight, self.bias, self.running_mean, self.running_var, self.eps, self.training or not self.track_running_stats, exponential_average_factor, self.process_group, self.channel_last, self.fuse_relu)
File "/gpfs01/user_home/zhuying/.conda/envs/py36/lib/python3.6/site-packages/apex/parallel/optimized_sync_batchnorm_kernel.py", line 27, in forward
mean, var_biased = syncbn.welford_mean_var(input)
RuntimeError: Dimension out of range (expected to be in range of [-2, 1], but got 2) (maybe_wrap_dim at /tmp/pip-req-build-p5q91txh/c10/core/WrapDimMinimal.h:20)
I’m facing the same issue, hoping you guys can fix this soon… many thanks!