Torch-Pruning: Yolov7 pruned model does not detect anything?

@VainF Thanks for the amazing repo. I tried to run inference with a pruned yolov7 model on one image, but the pruned model did not detect anything (the output image has no bboxes): https://github.com/VainF/Torch-Pruning/blob/master/benchmarks/prunability/readme.md#3-yolo-v7

python yolov7_detect_pruned.py --weights yolov7.pt --conf 0.25 --img-size 640 --source inference/images/horses.jpg

I saw that in yolov7_detect_pruned.py you already set ignored_layers:

    ################################################################################
    # Pruning
    example_inputs = torch.randn(1, 3, 224, 224).to(device)
    imp = tp.importance.MagnitudeImportance(p=2) # L2 norm pruning

    ignored_layers = []
    from models.yolo import Detect
    for m in model.modules():
        if isinstance(m, Detect):
            ignored_layers.append(m)
    print(ignored_layers)

    iterative_steps = 1 # number of pruning steps (1 = one-shot, >1 = progressive pruning)
    pruner = tp.pruner.MagnitudePruner(
        model,
        example_inputs,
        importance=imp,
        iterative_steps=iterative_steps,
        ch_sparsity=0.5, # remove 50% channels, ResNet18 = {64, 128, 256, 512} => ResNet18_Half = {32, 64, 128, 256}
        ignored_layers=ignored_layers,
    )
    base_macs, base_nparams = tp.utils.count_ops_and_params(model, example_inputs)
    pruner.step()
    pruned_macs, pruned_nparams = tp.utils.count_ops_and_params(model, example_inputs)
    print(model)
    print("Before Pruning: MACs=%f G, #Params=%f G"%(base_macs/1e9, base_nparams/1e9))
    print("After Pruning: MACs=%f G, #Params=%f G"%(pruned_macs/1e9, pruned_nparams/1e9))
    ####################################################################################

But in the log I saw this for the Detect module before pruning:

(105): Detect(
      (m): ModuleList(
        (0): Conv2d(256, 255, kernel_size=(1, 1), stride=(1, 1))
        (1): Conv2d(512, 255, kernel_size=(1, 1), stride=(1, 1))
        (2): Conv2d(1024, 255, kernel_size=(1, 1), stride=(1, 1))
      )
    )

and after pruning

(105): Detect(
      (m): ModuleList(
        (0): Conv2d(128, 255, kernel_size=(1, 1), stride=(1, 1))
        (1): Conv2d(256, 255, kernel_size=(1, 1), stride=(1, 1))
        (2): Conv2d(512, 255, kernel_size=(1, 1), stride=(1, 1))
      )
    )

Could you @VainF check it again? Thanks
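
As a rough sanity check (my own arithmetic, not from the repo): since ch_sparsity=0.5 halves both the output channels of a conv and, via the pruned preceding layer, its input channels, most internal convs keep only about a quarter of their weights, so very poor detections before fine-tuning would not be unusual.

```python
# Rough arithmetic, not code from the repo: with ch_sparsity=0.5 a typical
# internal conv loses half of its output channels AND half of its input
# channels (the preceding layer was pruned too), keeping ~25% of its weights.

def conv_weight_count(c_in, c_out, k):
    """Number of weights in a conv layer with a k x k kernel (bias ignored)."""
    return c_in * c_out * k * k

base = conv_weight_count(512, 1024, 3)    # an unpruned 3x3 conv
pruned = conv_weight_count(256, 512, 3)   # the same conv after 50% channel pruning
print(pruned / base)                      # 0.25 -> only a quarter of the weights remain
```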

About this issue

  • Original URL
  • State: open
  • Created a year ago
  • Comments: 15 (3 by maintainers)

Most upvoted comments

@AymenBOUGUERRA Thank you so much for the information.

> Furthermore, it seems that the pruning ratio must be (1 - 1/2^n) with 0 < n < 5 in order to get a speedup in TensorRT, and using such aggressive pruning ratios will require you not only to fine-tune the model but to retrain it from scratch, as the feature maps are utterly destroyed.

Maybe we need to find another way, such as KD (knowledge distillation), to keep reasonable accuracy. I don't know why some pruned models are slower than the original model.

@aidevmin Hello again,

> Yes, I agree with you. I can successfully export .pt to ONNX without reparameterization.

Be careful to always load the model using the functions provided in the repo, such as attempt_load(). The reason is that un-reparameterized checkpoints contain two sets of weights, "model" and "EMA"; the EMA weights are about 2 to 5 points better than the default weights in terms of mAP and other metrics. The provided function will try to load EMA first, and fall back to "model" if EMA is not present.
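
The EMA-first preference described above can be sketched roughly like this (a simplified stand-in for illustration, not the actual attempt_load() from the yolov7 repo, which also handles things like weight downloading):

```python
# Simplified sketch of the loading preference described above -- not the
# real attempt_load(), just the "EMA first, then model" idea.

def select_weights(ckpt):
    """Prefer the EMA weights when present; otherwise fall back to 'model'."""
    if ckpt.get("ema") is not None:
        return ckpt["ema"]
    return ckpt["model"]

print(select_weights({"model": "raw", "ema": "averaged"}))  # -> averaged
print(select_weights({"model": "raw", "ema": None}))        # -> raw
```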

> One interesting thing I found is that after removing 1% of the channels, the pruned model (TRT model) is actually slower than the model before pruning. It is surprising. I will investigate more and let you know.

I have already encountered this issue; here are the results of my investigation: [image]

Furthermore, it seems that the pruning ratio must be (1 - 1/2^n) with 0 < n < 5 in order to get a speedup in TensorRT, and using such aggressive pruning ratios will require you not only to fine-tune the model but to retrain it from scratch, as the feature maps are utterly destroyed.

Don’t hesitate for any question or clarification.

@AymenBOUGUERRA Thanks for the detailed response.

> I don't think that reparameterization is needed, as exporting to ONNX will apply all of the necessary optimizations to the model as well as export the correct model (EMA if applicable).

Yes, I agree with you. I can successfully export .pt to ONNX without reparameterization.

One interesting thing I found is that after removing 1% of the channels, the pruned model (TRT model) is actually slower than the model before pruning. It is surprising. I will investigate more and let you know.

  1. mAP: The performance of pruned yolov7 has not been checked.
  2. Save & Load: Please try tp.state_dict & tp.load_state_dict. This allows us to save the attributes like conv.in_channels into a .pth and re-load the pruned model using an unpruned one.
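
A conceptual illustration of why this extra step is needed (my own sketch, not Torch-Pruning's actual tp.state_dict implementation): a plain state_dict cannot be loaded into the unpruned model because the tensor shapes no longer match, so the checkpoint must also record the pruned attributes such as conv.in_channels so the loader can reshape the model first:

```python
# Conceptual illustration only -- not Torch-Pruning's real tp.state_dict code.
# A pruned checkpoint records the pruned layer shapes alongside the weights,
# so the loader can first shrink the unpruned definition, then copy weights.

def save_pruned_state(model):
    # model: {layer_name: {"in_channels": int, "out_channels": int, "weight": list}}
    return {name: dict(layer) for name, layer in model.items()}

def load_pruned_state(unpruned, state):
    for name, entry in state.items():
        layer = unpruned[name]
        # Adopt the pruned shape recorded in the checkpoint first...
        layer["in_channels"] = entry["in_channels"]
        layer["out_channels"] = entry["out_channels"]
        # ...so that the stored (smaller) weights fit.
        layer["weight"] = entry["weight"]
    return unpruned

pruned = {"conv1": {"in_channels": 128, "out_channels": 128, "weight": [0.1] * 4}}
unpruned = {"conv1": {"in_channels": 256, "out_channels": 256, "weight": [0.0] * 16}}
restored = load_pruned_state(unpruned, save_pruned_state(pruned))
print(restored["conv1"]["in_channels"])  # -> 128
```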