Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective
Authors: Can Jin, Tianjin Huang, Yihua Zhang, Mykola Pechenizkiy, Sijia Liu, Shiwei Liu, Tianlong Chen
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments with 3 network architectures and 8 datasets evidence the substantial performance improvements from VPNs over existing state-of-the-art pruning algorithms. Furthermore, we find that subnetworks discovered by VPNs from pre-trained models enjoy better transferability across diverse downstream scenarios. These insights shed light on new promising possibilities of data-model co-designs for vision model sparsification. |
| Researcher Affiliation | Academia | ¹Rutgers University, ²Eindhoven University of Technology, ³University of Exeter, ⁴Michigan State University, ⁵University of Oxford, ⁶University of North Carolina at Chapel Hill |
| Pseudocode | Yes | The detailed procedure of VPNs is summarized in the Algorithm in the Appendix. |
| Open Source Code | No | No explicit statement or link for the source code for the methodology described in this paper is provided. The paper mentions downloading models from the 'official PyTorch Model Zoo', but this refers to third-party models, not the authors' implementation. |
| Open Datasets | Yes | We then evaluate the effectiveness of VPNs over eight downstream datasets: Tiny ImageNet (Le and Yang 2015), Stanford Cars (Krause et al. 2013), Oxford Pets (Parkhi et al. 2012), Food101 (Bossard, Guillaumin, and Van Gool 2014), DTD (Cimpoi et al. 2014), Flowers102 (Nilsback and Zisserman 2008), CIFAR10/100 (Krizhevsky, Hinton et al. 2009), respectively. |
| Dataset Splits | Yes | We use three pre-trained network architectures for our experiments: ResNet-18 (He et al. 2016), ResNet-50 (He et al. 2016), and VGG-16 (Simonyan and Zisserman 2014), which can be downloaded from the official PyTorch Model Zoo. These models are pre-trained on ImageNet-1K (Deng et al. 2009). We then evaluate the effectiveness of VPNs over eight downstream datasets: Tiny ImageNet (Le and Yang 2015), Stanford Cars (Krause et al. 2013), Oxford Pets (Parkhi et al. 2012), Food101 (Bossard, Guillaumin, and Van Gool 2014), DTD (Cimpoi et al. 2014), Flowers102 (Nilsback and Zisserman 2008), CIFAR10/100 (Krizhevsky, Hinton et al. 2009), respectively. |
| Hardware Specification | No | No specific hardware details (like exact GPU/CPU models or memory) used for running the experiments are provided in the main text. The mention of 'five 80GB A100 GPUs' is in the context of GPT-3, not the authors' experimental setup. |
| Software Dependencies | No | No specific software versions (e.g., Python 3.8, PyTorch 1.9) are provided. The paper mentions the 'official PyTorch Model Zoo' but not the version of PyTorch used. |
| Experiment Setup | Yes | As for visual prompts, our default VP design in VPNs employs a pad prompt with an input size of 224 and a pad size of 16. We also use an input size of 224 for all the baselines to ensure a fair comparison. More implementation details are in the Appendix. |
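The reported pad-prompt setup (input size 224, pad size 16) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name, array shapes, and the additive-border formulation are assumptions, shown only to make the "pad prompt" geometry concrete — learnable parameters occupy a 16-pixel frame around a 192×192 image region.

```python
import numpy as np

def apply_pad_prompt(image, pad_params, input_size=224, pad_size=16):
    """Hypothetical sketch of a pad-style visual prompt.

    image:      (3, inner, inner) array, where inner = input_size - 2*pad_size
                (192 for the defaults reported in the paper).
    pad_params: (3, input_size, input_size) learnable prompt values; only the
                border frame of width pad_size actually affects the output.
    """
    # Binary mask: 1 on the border frame, 0 in the interior.
    mask = np.ones((1, input_size, input_size), dtype=np.float32)
    mask[:, pad_size:-pad_size, pad_size:-pad_size] = 0.0

    # Place the image in the center of a zero canvas, then add the
    # masked prompt so only the border pixels are learnable.
    out = np.zeros((3, input_size, input_size), dtype=np.float32)
    out[:, pad_size:-pad_size, pad_size:-pad_size] = image
    return out + mask * pad_params
```

With the default sizes, the prompted input is always 224×224, which matches the paper's note that all baselines also use an input size of 224 for a fair comparison.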