Structural Pruning via Spatial-aware Information Redundancy for Semantic Segmentation

Authors: Dongyue Wu, Zilin Guo, Li Yu, Nong Sang, Changxin Gao

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental To validate the effectiveness of our method, we conducted extensive comparative experiments on various challenging datasets. The results demonstrate the superior performance of SIRFP for semantic segmentation tasks. For example, Figure 1 shows that the proposed SIRFP achieves the state-of-the-art trade-off between the number of parameters and mIoU performance on the Cityscapes dataset compared with both pruning methods and recent real-time segmentation methods. Our method is also evaluated on object detection and image classification tasks, showing decent generalization ability. ... Although this paper mainly focuses on semantic segmentation, to comprehensively evaluate the efficacy and generality of our method, we also conduct experiments on a variety of computer vision tasks, including image classification and object detection.
Researcher Affiliation Academia 1) National Key Laboratory of Multispectral Information Intelligent Processing Technology, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology; 2) School of Electronic Information and Communications, Huazhong University of Science and Technology
Pseudocode Yes Algorithm 1: Efficient Heuristic Greedy Pruning
Input: Weight of the l-th layer W^l ∈ R^{C_l × C_{l-1}}, where w^l_{i,:} is the weight of the i-th channel; the graph G = (V, E); the channel vertex set V = {1, 2, ..., C_l}
Output: The weight W^l_p ∈ R^{C^p_l × C_{l-1}} after pruning
1: Get the edge weight matrix A^l
2: Set the vertex set V_r to be removed as the empty set
3: Set the pruning index k = 1, counter t = 1
// Calculate sum of edge weights:
4: for i ∈ V do
5:   s_i ← Σ_{j ∈ V\{i}} a_ij
6:   if s_k > s_i then
7:     k ← i  // get the least sum
// Start iterative pruning:
8: for each prune iteration t_p ∈ [b] do
9:   V_r ← V_r ∪ {k}  // prune the selected k
10:  q ← random(V \ V_r)
11:  for i ∈ V \ V_r do  // update sums for the next iteration
12:    s_i ← s_i − a_ik
13:    if s_q > s_i then
14:      q ← i  // get the least sum
15:  k ← q; t ← t + 1
16: W^l_p ← Concat([w^l_{i,:} for i ∈ V \ V_r])
17: return W^l_p
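As a rough illustration, Algorithm 1 can be sketched in PyTorch-style Python. This is only a sketch: the function name, the infinity-masking trick for excluding pruned channels, and the toy inputs are my own, and the edge weight (redundancy) matrix A^l is assumed to be precomputed as the paper describes.

```python
import torch

def greedy_prune_layer(weight, edge_weights, num_prune):
    """Sketch of Algorithm 1 (efficient heuristic greedy pruning).

    weight:       (C_l, C_{l-1}) weight matrix of the l-th layer
    edge_weights: (C_l, C_l) symmetric edge weight matrix A^l
    num_prune:    number of channels to remove
    """
    # Steps 4-7: per-channel sum of edge weights to all other channels
    s = edge_weights.sum(dim=1) - edge_weights.diagonal()
    k = int(torch.argmin(s))           # channel with the least sum
    removed = []
    # Steps 8-15: iteratively prune, updating the sums in O(C) per step
    for _ in range(num_prune):
        removed.append(k)
        s[k] = float("inf")            # exclude pruned channel from argmin
        s = s - edge_weights[:, k]     # step 12: s_i <- s_i - a_ik
        k = int(torch.argmin(s))
    # Step 16: concatenate the surviving channels
    keep = [i for i in range(weight.shape[0]) if i not in removed]
    return weight[keep]
```

The incremental update in step 12 is what replaces a naive recomputation of all pairwise sums after every removal, which appears to be the source of the "efficient heuristic" label.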
Open Source Code Yes Code https://github.com/dywu98/SIRFP.git
Open Datasets Yes Following DCFP, aside from the popular but rather easy Cityscapes (Cordts et al. 2016) dataset (19 categories), we also evaluate our method on the challenging ADE20K (Zhou et al. 2017) (150 categories) and COCO stuff-10K (Caesar, Uijlings, and Ferrari 2018) (171 categories). For image classification and object detection, we compare the proposed method with other state-of-the-art pruning methods on the ImageNet (Deng et al. 2009) and COCO2017 (Lin et al. 2014) datasets, respectively.
Dataset Splits Yes Comparison on Cityscapes To compare our method with other popular pruning algorithms including NS (Liu et al. 2017), Taylor (Molchanov et al. 2019), DepGraph (Fang et al. 2023), FPGM (He et al. 2019), and DCFP (Wang et al. 2024), we reproduce these methods and prune the DeepLabV3-ResNet50 model on Cityscapes. The results are shown in Table 1. ... Comparison on ADE20K and COCO stuff-10K The comparisons with existing pruning methods on ADE20K and COCO stuff-10K are reported in Table 3. ... Experimental results conducted on the COCO2017 dataset are reported in Table 4. ... Hence, we conduct experiments on the ImageNet dataset and the results are reported in Table 5.
Hardware Specification Yes The experiments run on the PyTorch framework using NVIDIA RTX 3090 GPUs.
Software Dependencies No The experiments run on the PyTorch framework using NVIDIA RTX 3090 GPUs. More implementation details, ablation studies, and visualization results are presented in the Appendix.
Experiment Setup Yes Following DCFP, all our experiments adopt the GSRL to improve performance. ... We prune the model and finetune it for a few iterations to restore the performance, which makes the pruning less destructive. This can also reduce the duration of accumulating the edge weight matrix and lead to an accurate reflection of the feature redundancy between channels. ... α is the hyper-parameter of the EMA, which is set to 0.99.
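The EMA accumulation of the edge weight matrix described above could look like the following sketch. The function and variable names are illustrative (not from the paper's code); only the momentum value α = 0.99 is taken from the stated setup.

```python
def ema_update(running_a, batch_a, alpha=0.99):
    """EMA accumulation of the edge weight matrix across iterations.

    running_a: accumulated matrix so far (None before the first batch)
    batch_a:   edge weight matrix computed from the current mini-batch
    alpha:     EMA momentum, set to 0.99 in the paper's setup
    """
    if running_a is None:
        return batch_a  # initialize with the first observation
    # New estimate leans heavily on the running value (alpha close to 1)
    return alpha * running_a + (1.0 - alpha) * batch_a
```

With α this close to 1 the running matrix changes slowly across batches, which fits the short accumulation window between pruning steps that the setup describes. The same update works elementwise on scalars or tensors.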