nni: EmptyLayerError() or UnBalancedGroupError() when pruning a depthwise separable convolution

Describe the issue: Errors when pruning a depthwise separable convolution

Depending on the value of 'total_sparsity', speedup may fail with one of the following errors (a sketch of the constraint behind them follows the list):

  1. EmptyLayerError
    raise EmptyLayerError()
nni.compression.pytorch.speedup.error_code.EmptyLayerError: Pruning a Layer to empty is not legal
  2. UnBalancedGroupError
    raise UnBalancedGroupError()
nni.compression.pytorch.speedup.error_code.UnBalancedGroupError: The number remained filters in each group is different
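
For context: both errors come from the group-balance invariant that ModelSpeedup enforces when rebuilding a grouped Conv2d: every group must keep the same, non-zero number of filters. A depthwise convolution has groups == out_channels, i.e. exactly one filter per group, so masking any filter empties its group. Below is a minimal sketch of that invariant (my own illustration, not NNI's actual code; check_group_balance is a hypothetical helper):

import torch

def check_group_balance(weight_mask: torch.Tensor, groups: int) -> None:
    # weight_mask has Conv2d weight shape: (out_channels, in_channels // groups, kH, kW)
    out_channels = weight_mask.shape[0]
    kept = weight_mask.flatten(1).sum(dim=1) > 0                   # which output filters survive
    kept_per_group = kept.view(groups, out_channels // groups).sum(dim=1)
    if (kept_per_group == 0).any():
        raise RuntimeError('a group was pruned empty')             # analogue of EmptyLayerError
    if kept_per_group.unique().numel() > 1:
        raise RuntimeError('groups kept different filter counts')  # analogue of UnBalancedGroupError

mask = torch.ones(128, 1, 3, 3)
mask[torch.randperm(128)[:77]] = 0      # zero out ~60% of the 128 depthwise filters
check_group_balance(mask, groups=128)   # raises: with one filter per group, some groups are now empty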

#4648 and #4796 seem to address this error, but they do not work in my code.

How can I solve it? Any help would be greatly appreciated, thanks!
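
Until this is fixed, one possible workaround is to exclude depthwise convolutions from the pruning config so the group constraint is never hit. This is a hedged sketch assuming the NNI v2 config_list schema, where an entry with 'exclude': True removes the matched layers from compression; model is the network from the repro below:

from torch import nn

# Names of depthwise convs (groups == in_channels > 1) in `model`
depthwise = [name for name, m in model.named_modules()
             if isinstance(m, nn.Conv2d) and m.groups == m.in_channels and m.groups > 1]
config_list = [
    {'total_sparsity': 0.6, 'op_types': ['Conv2d']},
    {'exclude': True, 'op_names': depthwise},
]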

Environment:

  • NNI version: 2.8
  • Training service (local|remote|pai|aml|etc):
  • Client OS: ubuntu 18.04
  • Server OS (for remote mode only):
  • Python version: 3.8
  • PyTorch/TensorFlow version: pytorch=1.10.1
  • Is conda/virtualenv/venv used?: conda
  • Is running in Docker?: no

How to reproduce it?:

import torch
from torch import nn
from nni.compression.pytorch.pruning import L2NormPruner
from nni.compression.pytorch.speedup import ModelSpeedup

# Depthwise convolution: groups == in_channels == out_channels,
# i.e. exactly one filter per group.
model = nn.Sequential(
    nn.Conv2d(128, 128, (3, 3), (1, 1), 1, groups=128, bias=False),
    nn.BatchNorm2d(128),
    nn.ReLU(inplace=True),
)

config_list = [{'total_sparsity': 0.6, 'op_types': ['Conv2d']}]

dummy_input = torch.rand(5, 128, 256, 256)
pruner = L2NormPruner(model, config_list, mode='dependency_aware', dummy_input=dummy_input)
_, masks = pruner.compress()
pruner._unwrap_model()
ModelSpeedup(model, dummy_input, masks).speedup_model()  # raises UnBalancedGroupError

print(model)

Log message:

[2022-07-08 11:41:49] start to speedup the model
[2022-07-08 11:41:53] infer module masks...
[2022-07-08 11:41:53] Update mask for 0
[2022-07-08 11:41:55] Update mask for 1
[2022-07-08 11:41:57] Update mask for 2
[2022-07-08 11:41:59] Update the indirect sparsity for the 2
/home/sf/anaconda3/envs/nni_py38/lib/python3.8/site-packages/torch/_tensor.py:1013: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations. (Triggered internally at /opt/conda/conda-bld/pytorch_1639180523671/work/build/aten/src/ATen/core/TensorBody.h:417.)
  return self._grad
[2022-07-08 11:41:59] Update the indirect sparsity for the 1
[2022-07-08 11:42:00] Update the indirect sparsity for the 0
[2022-07-08 11:42:01] resolve the mask conflict
[2022-07-08 11:42:01] replace compressed modules...
[2022-07-08 11:42:01] replace module (name: 0, op_type: Conv2d)
Traceback (most recent call last):
  File "ttttt.py", line 35, in <module>
    ModelSpeedup(model, dummy_input, masks).speedup_model()
  File "/home/sf/anaconda3/envs/nni_py38/lib/python3.8/site-packages/nni/compression/pytorch/speedup/compressor.py", line 543, in speedup_model
    self.replace_compressed_modules()
  File "/home/sf/anaconda3/envs/nni_py38/lib/python3.8/site-packages/nni/compression/pytorch/speedup/compressor.py", line 402, in replace_compressed_modules
    self.replace_submodule(unique_name)
  File "/home/sf/anaconda3/envs/nni_py38/lib/python3.8/site-packages/nni/compression/pytorch/speedup/compressor.py", line 473, in replace_submodule
    compressed_module = replace_function(
  File "/home/sf/anaconda3/envs/nni_py38/lib/python3.8/site-packages/nni/compression/pytorch/speedup/compress_modules.py", line 16, in <lambda>
    'Conv2d': lambda module, masks: replace_conv2d(module, masks),
  File "/home/sf/anaconda3/envs/nni_py38/lib/python3.8/site-packages/nni/compression/pytorch/speedup/compress_modules.py", line 424, in replace_conv2d
    raise UnBalancedGroupError()
nni.compression.pytorch.speedup.error_code.UnBalancedGroupError: The number remained filters in each group is different
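
The traceback ends in replace_conv2d, which cannot rebuild the pruned depthwise layer. For reference, pruning channels of a depthwise conv has to shrink groups together with the channel counts; the sketch below shows, under that assumption, the layer a successful speedup would need to produce (a hand-written illustration, not NNI code):

import torch
from torch import nn

old = nn.Conv2d(128, 128, (3, 3), (1, 1), 1, groups=128, bias=False)
keep = torch.arange(128)[:51]            # 51 ~= 128 * (1 - 0.6) channels kept
new = nn.Conv2d(51, 51, (3, 3), (1, 1), 1, groups=51, bias=False)
with torch.no_grad():
    new.weight.copy_(old.weight[keep])   # depthwise weight shape is (channels, 1, 3, 3)
print(new)                               # Conv2d(51, 51, kernel_size=(3, 3), ..., groups=51)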

About this issue

  • State: open
  • Created 2 years ago
  • Comments: 15 (9 by maintainers)

Most upvoted comments

Sorry, no progress so far. It's hard to fix, and we are now refactoring ModelSpeedup in 3.0; we will try to fix it after that.

Reproduced the bug. Need some time to fix it.