This repository provides the official implementation of IDAP++, a neural network compression approach that unifies filter-level (width) and architecture-level (depth) pruning through information flow divergence analysis. The method applies to diverse neural architectures, including convolutional networks and transformer-based models.
We propose the first pruning methodology that systematically optimizes neural networks along both width (filter-level) and depth (layer-level) dimensions through a unified flow-divergence criterion. The framework combines:
- Divergence-Aware Filter Pruning (IDAP)
- Flow-Guided Layer Truncation
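As a rough illustration of the flow-divergence idea behind both components, the sketch below scores a layer's channels by how strongly the layer reshapes the distribution of activation energy (a KL divergence between normalized channel energies), then keeps the highest-scoring filters. This is a minimal, hypothetical stand-in for intuition only; the function names and the energy-based formulation are assumptions, not the repository's actual criterion.

```python
import numpy as np

def flow_divergence(energy_in, energy_out, eps=1e-8):
    """Hypothetical flow-divergence score: KL divergence between the
    normalized per-channel activation energies entering and leaving a
    layer. An illustrative stand-in, not the official IDAP++ criterion."""
    p = np.asarray(energy_in, dtype=float) + eps
    q = np.asarray(energy_out, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def select_filters(scores, keep_ratio=0.5):
    """Keep the fraction of filters with the highest divergence scores,
    i.e. those presumed to carry the most information flow."""
    k = max(1, int(round(len(scores) * keep_ratio)))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k])
```

Identical input/output distributions score (near) zero, so filters that leave the flow unchanged are natural pruning candidates under this toy criterion.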
- Python 3.10+
- PyTorch 2.0+
- CUDA-compatible GPU
- Other dependencies listed in requirements.txt
- Clone the repository:
git clone https://github.com/user852154/divergence_aware_pruning.git
cd divergence_aware_pruning
- Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate
- Install dependencies:
pip install -r requirements.txt
- Pruning Results for Different Architectures Using IDAP++: Base vs. Pruned Models (Acc@1, GFlops, Δ%)
The table below presents the outcomes of our experiments, offering a comparative analysis of pruning across various model architectures and datasets. It reports top-1 accuracy (Acc@1) for both the original and pruned models, along with their computational cost measured in GFlops. The Δ% columns indicate the relative changes in accuracy and computational complexity resulting from pruning.
| Architecture | Dataset | Acc@1 Base | Acc@1 Pruned | Δ% | GFlops Base | GFlops Pruned | Δ% |
|---|---|---|---|---|---|---|---|
| ResNet-50 | ImageNet | 76.13 | 74.62 | −1.99 | 4.1 | 1.5 | −63 |
| | CIFAR-100 | 86.61 | 84.18 | −2.80 | 4.1 | 1.2 | −71 |
| | CIFAR-10 | 98.20 | 95.98 | −2.26 | 4.1 | 1.1 | −72 |
| | Stanford Cars | 92.52 | 90.14 | −2.57 | 4.1 | 1.2 | −70 |
| | Flowers-102 | 97.91 | 96.75 | −1.19 | 4.1 | 1.5 | −64 |
| | iNaturalist | 76.14 | 74.49 | −2.17 | 4.1 | 1.4 | −65 |
| | Food101 | 90.45 | 88.58 | −2.07 | 4.1 | 1.3 | −67 |
| | Oxford-IIIT Pet | 93.12 | 92.19 | −1.00 | 4.1 | 1.4 | −65 |
| | Fashion MNIST | 93.18 | 91.79 | −1.49 | 4.1 | 0.8 | −80 |
| | FER2013 | 71.80 | 69.52 | −3.18 | 4.1 | 1.3 | −67 |
| EfficientNet-B4 | ImageNet | 83.38 | 81.85 | −1.84 | 4.2 | 1.5 | −65 |
| | CIFAR-100 | 90.12 | 88.07 | −2.27 | 4.2 | 1.5 | −65 |
| | CIFAR-10 | 96.91 | 95.52 | −1.44 | 4.2 | 1.3 | −70 |
| | Stanford Cars | 91.34 | 89.06 | −2.50 | 4.2 | 1.4 | −68 |
| | Flowers-102 | 96.91 | 95.50 | −1.46 | 4.2 | 1.5 | −63 |
| | iNaturalist | 70.58 | 68.72 | −2.64 | 4.2 | 1.3 | −68 |
| | Food101 | 91.23 | 88.91 | −2.54 | 4.2 | 1.5 | −65 |
| | Oxford-IIIT Pet | 87.85 | 85.71 | −2.43 | 4.2 | 1.6 | −61 |
| | Fashion MNIST | 94.98 | 93.27 | −1.80 | 4.2 | 1.4 | −66 |
| | FER2013 | 74.17 | 72.23 | −2.61 | 4.2 | 1.4 | −68 |
| ViT-Base/16 | ImageNet | 81.07 | 79.49 | −1.95 | 17.5 | 6.3 | −64 |
| | CIFAR-100 | 94.25 | 92.19 | −2.19 | 17.5 | 5.8 | −67 |
| | CIFAR-10 | 98.61 | 96.99 | −1.64 | 17.5 | 4.3 | −75 |
| | Stanford Cars | 93.74 | 91.05 | −2.87 | 17.5 | 5.1 | −71 |
| | Flowers-102 | 95.53 | 94.56 | −1.01 | 17.5 | 5.5 | −68 |
| | iNaturalist | 68.65 | 67.16 | −2.17 | 17.5 | 6.8 | −61 |
| | Food101 | 87.41 | 85.00 | −2.76 | 17.5 | 6.5 | −63 |
| | Oxford-IIIT Pet | 89.57 | 87.32 | −2.51 | 17.5 | 4.9 | −72 |
| | Fashion MNIST | 92.83 | 90.81 | −2.18 | 17.5 | 6.5 | −63 |
| | FER2013 | 70.21 | 67.95 | −3.23 | 17.5 | 6.0 | −66 |
| MobileNetV3-Large | ImageNet | 74.04 | 72.05 | −2.68 | 0.2 | 0.1 | −67 |
| | CIFAR-100 | 77.70 | 76.04 | −2.13 | 0.2 | 0.1 | −63 |
| | CIFAR-10 | 89.81 | 88.56 | −1.40 | 0.2 | 0.1 | −68 |
| | Stanford Cars | 83.87 | 82.37 | −1.79 | 0.2 | 0.1 | −66 |
| | Flowers-102 | 90.02 | 88.68 | −1.48 | 0.2 | 0.1 | −64 |
| | iNaturalist | 68.32 | 67.16 | −1.70 | 0.2 | 0.1 | −66 |
| | Food101 | 87.42 | 85.59 | −2.09 | 0.2 | 0.1 | −72 |
| | Oxford-IIIT Pet | 85.54 | 83.33 | −2.59 | 0.2 | 0.1 | −68 |
| | Fashion MNIST | 92.74 | 90.60 | −2.31 | 0.2 | 0.1 | −73 |
| | FER2013 | 69.87 | 67.79 | −2.98 | 0.2 | 0.1 | −63 |
| DenseNet-121 | ImageNet | 74.65 | 73.84 | −1.08 | 2.8 | 0.9 | −68 |
| | CIFAR-100 | 72.07 | 70.11 | −2.72 | 2.8 | 0.9 | −69 |
| | CIFAR-10 | 94.21 | 92.84 | −1.46 | 2.8 | 0.7 | −74 |
| | Stanford Cars | 83.14 | 81.06 | −2.50 | 2.8 | 0.9 | −70 |
| | Flowers-102 | 91.03 | 88.75 | −2.51 | 2.8 | 0.8 | −70 |
| | iNaturalist | 69.74 | 67.94 | −2.57 | 2.8 | 0.8 | −71 |
| | Food101 | 87.34 | 84.87 | −2.82 | 2.8 | 0.8 | −72 |
| | Oxford-IIIT Pet | 85.23 | 83.59 | −1.92 | 2.8 | 0.7 | −76 |
| | Fashion MNIST | 93.01 | 90.88 | −2.29 | 2.8 | 0.9 | −66 |
| | FER2013 | 65.13 | 63.13 | −3.07 | 2.8 | 0.8 | −71 |
| ConvNeXt-Small | ImageNet | 83.61 | 81.21 | −2.87 | 8.6 | 2.6 | −70 |
| | CIFAR-100 | 85.58 | 83.36 | −2.59 | 8.6 | 2.2 | −74 |
| | CIFAR-10 | 94.21 | 92.00 | −2.35 | 8.6 | 2.3 | −74 |
| | Stanford Cars | 82.19 | 80.77 | −1.72 | 8.6 | 2.8 | −68 |
| | Flowers-102 | 90.09 | 88.44 | −1.84 | 8.6 | 3.5 | −59 |
| | iNaturalist | 68.90 | 67.53 | −1.98 | 8.6 | 3.3 | −61 |
| | Food101 | 86.05 | 84.33 | −2.00 | 8.6 | 3.1 | −64 |
| | Oxford-IIIT Pet | 84.08 | 82.18 | −2.26 | 8.6 | 2.9 | −67 |
| | Fashion MNIST | 93.01 | 90.85 | −2.32 | 8.6 | 2.6 | −69 |
| | FER2013 | 76.10 | 74.05 | −2.70 | 8.6 | 2.7 | −68 |
| VGG19-BN | ImageNet | 74.22 | 72.64 | −2.13 | 19.6 | 6.8 | −65 |
| | CIFAR-100 | 73.89 | 71.38 | −3.40 | 19.6 | 5.9 | −70 |
| | CIFAR-10 | 93.45 | 91.89 | −1.67 | 19.6 | 4.8 | −76 |
| | Stanford Cars | 88.12 | 86.54 | −1.80 | 19.6 | 6.2 | −68 |
| | Flowers-102 | 92.34 | 90.99 | −1.46 | 19.6 | 5.5 | −72 |
| | iNaturalist | 67.21 | 65.77 | −2.15 | 19.6 | 6.1 | −69 |
| | Food101 | 85.67 | 83.39 | −2.66 | 19.6 | 5.8 | −70 |
| | Oxford-IIIT Pet | 86.45 | 83.93 | −2.91 | 19.6 | 5.6 | −71 |
| | Fashion MNIST | 91.78 | 89.48 | −2.51 | 19.6 | 5.5 | −72 |
| | FER2013 | 68.34 | 66.68 | −2.43 | 19.6 | 6.8 | −65 |
| ShuffleNet V2 x2.0 | ImageNet | 76.23 | 74.40 | −2.40 | 0.5 | 0.2 | −63 |
| | CIFAR-100 | 75.32 | 73.14 | −2.89 | 0.5 | 0.2 | −63 |
| | CIFAR-10 | 90.45 | 88.66 | −1.98 | 0.5 | 0.1 | −83 |
| | Stanford Cars | 82.56 | 80.45 | −2.56 | 0.5 | 0.2 | −61 |
| | Flowers-102 | 89.12 | 87.78 | −1.50 | 0.5 | 0.2 | −63 |
| | iNaturalist | 66.78 | 65.35 | −2.15 | 0.5 | 0.2 | −67 |
| | Food101 | 84.23 | 82.30 | −2.29 | 0.5 | 0.2 | −64 |
| | Oxford-IIIT Pet | 83.67 | 81.79 | −2.25 | 0.5 | 0.2 | −66 |
| | Fashion MNIST | 90.89 | 89.08 | −2.00 | 0.5 | 0.1 | −83 |
| | FER2013 | 67.45 | 65.55 | −2.82 | 0.5 | 0.2 | −64 |
| Inception-v3 | ImageNet | 77.17 | 75.77 | −1.81 | 5.7 | 1.9 | −67 |
| | CIFAR-100 | 82.15 | 79.30 | −3.47 | 5.7 | 1.5 | −73 |
| | CIFAR-10 | 95.32 | 93.84 | −1.56 | 5.7 | 1.4 | −76 |
| | Stanford Cars | 88.76 | 87.06 | −1.92 | 5.7 | 1.7 | −70 |
| | Flowers-102 | 93.45 | 92.17 | −1.37 | 5.7 | 2.0 | −65 |
| | iNaturalist | 72.34 | 70.62 | −2.37 | 5.7 | 2.0 | −66 |
| | Food101 | 88.12 | 85.57 | −2.90 | 5.7 | 1.8 | −68 |
| | Oxford-IIIT Pet | 89.34 | 87.00 | −2.61 | 5.7 | 1.8 | −68 |
| | Fashion MNIST | 92.78 | 90.96 | −1.96 | 5.7 | 1.3 | −77 |
| | FER2013 | 70.45 | 68.57 | −2.67 | 5.7 | 1.9 | −67 |
| EfficientNetV2-S | ImageNet | 84.22 | 82.54 | −2.00 | 8.8 | 3.1 | −65 |
| | CIFAR-100 | 88.45 | 85.90 | −2.88 | 8.8 | 2.9 | −67 |
| | CIFAR-10 | 97.12 | 95.72 | −1.45 | 8.8 | 2.4 | −72 |
| | Stanford Cars | 90.23 | 88.78 | −1.60 | 8.8 | 2.9 | −67 |
| | Flowers-102 | 96.78 | 95.55 | −1.27 | 8.8 | 3.3 | −63 |
| | iNaturalist | 75.67 | 73.43 | −2.96 | 8.8 | 3.0 | −66 |
| | Food101 | 90.56 | 88.98 | −1.74 | 8.8 | 3.2 | −64 |
| | Oxford-IIIT Pet | 89.12 | 87.52 | −1.79 | 8.8 | 3.2 | −63 |
| | Fashion MNIST | 95.34 | 93.47 | −1.96 | 8.8 | 2.7 | −70 |
| | FER2013 | 76.89 | 74.70 | −2.85 | 8.8 | 2.8 | −68 |
- Comparative Accuracy of Our Method and Prior Pruning Techniques on CIFAR-10
The table below presents a comparison between our method and other pruning techniques on the CIFAR-10 dataset, where around 60% of the model weights are removed. The results show that our approach achieves comparable weight reduction while preserving higher accuracy than alternative methods.
- Model Compression Dynamics of ResNet-50 on CIFAR-10 Using the Two-Stage IDAP++ Framework
The table below demonstrates the pruning dynamics of the ResNet-50 model on the CIFAR-10 dataset using our IDAP++ algorithm over 35 pruning steps. The results show the gradual reduction in model parameters and computational complexity while maintaining high accuracy throughout most of the pruning process.
| Pruning Step | Stage | Params (M) | GFlops | Top-1 Acc. (%) | Top-5 Acc. (%) | Δ Top-1 Acc. |
|---|---|---|---|---|---|---|
| 1 | Baseline | 23.53 | 4.09 | 98.20 | 99.86 | 0.00 |
| 2 | Filter Prune | 22.27 | 3.89 | 97.66 | 99.85 | -0.54 |
| 3 | Filter Prune | 21.20 | 3.66 | 97.23 | 99.84 | -0.97 |
| 4 | Filter Prune | 19.89 | 3.46 | 96.99 | 99.73 | -1.21 |
| 5 | Filter Prune | 18.78 | 3.31 | 97.11 | 99.89 | -1.09 |
| 6 | Filter Prune | 17.54 | 3.13 | 97.74 | 99.89 | -0.46 |
| 7 | Filter Prune | 16.45 | 2.90 | 97.62 | 99.84 | -0.58 |
| 8 | Filter Prune | 15.50 | 2.73 | 97.93 | 99.87 | -0.27 |
| 9 | Filter Prune | 14.62 | 2.61 | 98.09 | 99.76 | -0.11 |
| 10 | Filter Prune | 14.14 | 2.52 | 98.05 | 99.75 | -0.15 |
| 11 | Filter Prune | 13.50 | 2.37 | 97.87 | 99.77 | -0.33 |
| 12 | Filter Prune | 12.98 | 2.26 | 97.85 | 99.81 | -0.35 |
| 13 | Filter Prune | 12.37 | 2.15 | 97.84 | 99.77 | -0.36 |
| 14 | Filter Prune | 11.82 | 2.08 | 97.77 | 99.79 | -0.43 |
| 15 | Filter Prune | 11.26 | 1.98 | 97.70 | 99.76 | -0.50 |
| 16 | Filter Prune | 11.02 | 1.94 | 97.85 | 99.80 | -0.35 |
| 17 | Filter Prune | 10.77 | 1.89 | 97.56 | 99.81 | -0.64 |
| 18 | Filter Prune | 10.53 | 1.85 | 97.50 | 99.79 | -0.70 |
| 19 | Filter Prune | 10.28 | 1.81 | 97.42 | 99.80 | -0.78 |
| 20 | Filter Prune | 10.04 | 1.77 | 97.35 | 99.78 | -0.85 |
| 21 | Filter Prune | 9.79 | 1.73 | 97.28 | 99.75 | -0.92 |
| 22 | Filter Prune | 9.55 | 1.68 | 97.50 | 99.77 | -0.70 |
| 23 | Filter Prune | 9.30 | 1.49 | 97.52 | 99.78 | -0.68 |
| 24 | Filter Prune | 9.05 | 1.45 | 97.08 | 99.77 | -1.12 |
| 25 | Filter Prune | 8.81 | 1.40 | 97.50 | 99.80 | -0.70 |
| 26 | Filter Prune | 8.56 | 1.34 | 97.40 | 99.81 | -0.80 |
| 27 | Filter Prune | 8.32 | 1.30 | 96.91 | 99.79 | -1.29 |
| 28 | Filter Prune | 8.07 | 1.26 | 97.25 | 99.78 | -0.95 |
| 29 | Filter Prune | 7.83 | 1.22 | 97.52 | 99.80 | -0.68 |
| 30 | Filter Prune | 7.57 | 1.19 | 97.63 | 99.81 | -0.57 |
| 31 | Layer Trunc | 6.73 | 1.17 | 97.22 | 99.39 | -0.98 |
| 32 | Layer Trunc | 6.67 | 1.16 | 96.78 | 98.94 | -1.42 |
| 33 | Layer Trunc | 6.62 | 1.15 | 96.42 | 98.57 | -1.78 |
| 34 | Layer Trunc | 6.56 | 1.14 | 95.57 | 98.03 | -2.63 |
| 35 | Final Fine-Tune | 6.56 | 1.14 | 95.98 | 98.12 | -2.22 |
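The schedule in the table can be summarized as a simple control loop: apply filter-pruning steps while the accuracy drop stays within a tolerance, then switch to layer truncation, then fine-tune. The sketch below shows only this control flow under stated assumptions; `evaluate`, `prune_filters`, `truncate_layers`, and `fine_tune` are placeholder callables, not the repository's actual API.

```python
def two_stage_prune(model, evaluate, prune_filters, truncate_layers,
                    fine_tune, filter_steps=30, trunc_steps=4, max_drop=3.0):
    """Control-flow sketch of a two-stage pruning schedule.

    All callables are placeholders: evaluate(model) returns top-1 accuracy,
    the stage functions return a smaller candidate model, and fine_tune
    performs the final recovery step."""
    base_acc = evaluate(model)
    for stage_fn, steps in ((prune_filters, filter_steps),
                            (truncate_layers, trunc_steps)):
        for _ in range(steps):
            candidate = stage_fn(model)
            if base_acc - evaluate(candidate) > max_drop:
                break  # the next step would exceed the accuracy tolerance
            model = candidate
    return fine_tune(model)  # fine-tuning recovers part of the drop
```

In the ResNet-50 run above this corresponds to 29 filter-pruning steps, 4 layer truncations, and a final fine-tune that lifts Top-1 accuracy from 95.57% back to 95.98%.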
- Inference Time Summary by Architecture (RTX 3060, Batch Size = 1, FP32)
The table below presents a comparison of inference times before and after pruning for various neural network architectures. It includes measurements of the base (unpruned) and pruned inference times in milliseconds, as well as the resulting speedup factor achieved through pruning. The results show that across all tested models, pruning leads to a notable reduction in inference time, with speedup factors ranging from 1.57× (EfficientNetV2-S) to 2.16× (MobileNetV3-Large).
| Architecture | Inference Time Base (ms) | Inference Time Pruned (ms) | Speedup × |
|---|---|---|---|
| ResNet-50 | 8.5 | 4.3 | 1.98 |
| EfficientNet-B4 | 8.8 | 4.6 | 1.91 |
| ViT-Base/16 | 33.2 | 20.3 | 1.64 |
| MobileNetV3-Large | 4.1 | 1.9 | 2.16 |
| DenseNet-121 | 6.2 | 3.3 | 1.88 |
| ConvNeXt-Small | 17.5 | 10.5 | 1.67 |
| VGG19-BN | 38.2 | 18.0 | 2.12 |
| ShuffleNet V2 x2.0 | 3.5 | 1.8 | 1.94 |
| Inception-v3 | 11.6 | 5.5 | 2.11 |
| EfficientNetV2-S | 17.4 | 11.1 | 1.57 |
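This kind of measurement is typically taken as warmup runs followed by averaged timed runs; a minimal sketch is below. Here `fn` stands in for a model's forward pass at batch size 1; on a GPU you would additionally synchronize (e.g. torch.cuda.synchronize) around the timed region, as the table's numbers were collected on an RTX 3060 in FP32.

```python
import time

def mean_latency_ms(fn, warmup=10, iters=100):
    """Average wall-clock latency of fn in milliseconds after warmup."""
    for _ in range(warmup):
        fn()  # warmup: populate caches, trigger lazy initialization
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters * 1e3

def speedup(base_ms, pruned_ms):
    """Speedup factor as reported in the table."""
    return base_ms / pruned_ms

print(round(speedup(8.5, 4.3), 2))  # 1.98  (ResNet-50 row)
```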
The figures below illustrate the training dynamics of ResNet-50 on the CIFAR-10 dataset, showing how various metrics evolve during the pruning process. The plots demonstrate the changes in computational complexity (GFLOPs), parameter count, and Top-1 accuracy across pruning steps, providing a comprehensive view of the model's behavior during optimization.
To reproduce the results reported in our paper:
- Follow the installation instructions above
- Download the preprocessed datasets using the provided scripts
- Run the training and evaluation scripts
- Use the plot_training_metrics.py script to generate training-dynamics plots and metric visualizations
We would like to express our gratitude to the following sources for providing pre-trained models that were used in this research:
- The authors of "ResNet strikes back: An improved training procedure in timm" (Wightman et al., 2021) for their foundational work on ResNet architectures;
- The authors of "Which backbone to use: A resource-efficient domain specific comparison for computer vision" (Jeevan & Sethi, 2024) for their contributions to efficient model architectures;
- The authors of "DepGraph: Towards any structural pruning" (Fang et al., 2023) for their codebase for the structural pruning;
- The PyTorch Vision team for their comprehensive model zoo (https://docs.pytorch.org/vision/0.19/models).
This project is licensed under the MIT License - see the LICENSE file for details.