DPaI: Differentiable Pruning at Initialization with Node-Path Balance Principle
Authors: Lichuan Xiang, Quan Nguyen-Tri, Lan-Cuong Nguyen, Hoang Pham, Khoat Than, Long Tran-Thanh, Hongkai Wen
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical results demonstrate that DPaI significantly outperforms current state-of-the-art PaI methods on various architectures, such as Convolutional Neural Networks and Vision Transformers. Code is available at https://github.com/QuanNguyen-Tri/DPaI.git |
| Researcher Affiliation | Collaboration | ¹University of Warwick, ²Hanoi University of Science and Technology, ³FPT Software AI Center, ⁴Collov Labs |
| Pseudocode | Yes | Algorithm 1 Differentiable Pruning at Initialization (DPaI) (a generic differentiable-masking sketch follows the table) |
| Open Source Code | Yes | Code is available at https://github.com/QuanNguyen-Tri/DPaI.git |
| Open Datasets | Yes | Our main experiments are conducted with CIFAR-10, CIFAR-100, and Tiny-ImageNet datasets, where: CIFAR-10 is augmented by normalizing per-channel, randomly flipping horizontally. CIFAR-100 is augmented by normalizing per-channel, randomly flipping horizontally. Tiny-ImageNet is augmented by normalizing per-channel, cropping to 64x64, and randomly flipping horizontally. We also perform experiments on ImageNet-1K (Deng et al., 2009) to verify our methods work on large-scale dataset tasks. (A transform sketch for these augmentations follows the table.) |
| Dataset Splits | Yes | Same passage as quoted for Open Datasets above. |
| Hardware Specification | Yes | We use the PyTorch library and conduct experiments on a single A5000. Each model was trained using three random seeds (0, 1, 2) to ensure robustness, and the model was trained on an Nvidia A100. |
| Software Dependencies | No | We use the PyTorch library and conduct experiments on a single A5000. |
| Experiment Setup | Yes | For training the final sparse network, the hyperparameters are chosen as follows (Table 6: Summary of the architectures, datasets, and hyperparameters used in experiments): VGG-19 on CIFAR-100: 160 epochs, batch 128, SGD, momentum 0.9, LR 0.1, 10x LR drop at epochs [60, 120], weight decay 0.0001. ResNet-20 on CIFAR-10: 160 epochs, batch 128, SGD, momentum 0.9, LR 0.1, 10x LR drop at epochs [60, 120], weight decay 0.0001. ResNet-18 on Tiny-ImageNet: 100 epochs, batch 128, SGD, momentum 0.9, LR 0.01, 10x LR drop at epochs [30, 60, 80], weight decay 0.0001. (A training-setup sketch follows the table.) |
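The pseudocode row cites the paper's Algorithm 1, which is not reproduced in this report. As a rough illustration of the general differentiable pruning-at-initialization pattern (learn continuous mask scores over frozen initial weights, then binarize to a sparsity budget), here is a minimal PyTorch sketch. The toy two-layer model, the cross-entropy stand-in objective, and all names are assumptions for illustration only; this does not implement DPaI's node-path balance objective.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy setting: a 2-layer MLP at random initialization.  We learn a soft mask
# per weight and keep only the globally top-k scored weights.  Hypothetical
# sketch -- NOT the paper's Algorithm 1; the task loss is a generic stand-in.
d_in, d_h, d_out, n = 20, 64, 10, 128
W1 = torch.randn(d_h, d_in) * 0.1   # frozen initial weights (no grad)
W2 = torch.randn(d_out, d_h) * 0.1
x = torch.randn(n, d_in)
y = torch.randint(0, d_out, (n,))

s1 = torch.zeros_like(W1, requires_grad=True)  # learnable mask logits
s2 = torch.zeros_like(W2, requires_grad=True)
opt = torch.optim.Adam([s1, s2], lr=0.1)

keep_ratio = 0.10
for step in range(200):
    opt.zero_grad()
    m1, m2 = torch.sigmoid(s1), torch.sigmoid(s2)        # soft masks in (0, 1)
    logits = F.relu(x @ (m1 * W1).t()) @ (m2 * W2).t()
    task_loss = F.cross_entropy(logits, y)
    # Penalty keeps the expected density near the sparsity budget.
    density = (m1.sum() + m2.sum()) / (m1.numel() + m2.numel())
    loss = task_loss + 10.0 * (density - keep_ratio) ** 2
    loss.backward()
    opt.step()

# Binarize: keep the top-k scored weights across both layers.
all_scores = torch.cat([s1.flatten(), s2.flatten()])
k = int(keep_ratio * all_scores.numel())
threshold = torch.topk(all_scores, k).values.min()
mask1, mask2 = (s1 >= threshold).float(), (s2 >= threshold).float()
print(f"final density: {(mask1.sum() + mask2.sum()) / all_scores.numel():.3f}")
```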
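For the augmentations quoted under Open Datasets, a torchvision sketch of the described pipeline follows. The normalization statistics and the padding used with the 64x64 crop are commonly used defaults assumed here; the quoted text does not state them.

```python
import torchvision.transforms as T
from torchvision.datasets import CIFAR10

# As described: per-channel normalization + random horizontal flip for
# CIFAR-10/100; Tiny-ImageNet adds a crop to 64x64.  Mean/std values are the
# usual published statistics, assumed rather than taken from the paper.
cifar10_train_tf = T.Compose([
    T.RandomHorizontalFlip(),
    T.ToTensor(),
    T.Normalize(mean=(0.4914, 0.4822, 0.4465), std=(0.2470, 0.2435, 0.2616)),
])

tiny_imagenet_train_tf = T.Compose([
    T.RandomCrop(64, padding=4),   # "cropping to 64x64"; padding=4 is an assumption
    T.RandomHorizontalFlip(),
    T.ToTensor(),
    T.Normalize(mean=(0.4802, 0.4481, 0.3975), std=(0.2770, 0.2691, 0.2821)),
])

train_set = CIFAR10(root="./data", train=True, download=True,
                    transform=cifar10_train_tf)
```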
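And for the Experiment Setup row, a sketch of the VGG-19 / CIFAR-100 configuration from Table 6: SGD with momentum 0.9, LR 0.1 dropped 10x at epochs 60 and 120, weight decay 0.0001, batch size 128, 160 epochs. The model and data below are stand-ins so the snippet runs on its own.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

# Stand-ins for VGG-19 and the CIFAR-100 loader; only the hyperparameters
# below are taken from Table 6.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 100))
train_loader = DataLoader(
    TensorDataset(torch.randn(512, 3, 32, 32), torch.randint(0, 100, (512,))),
    batch_size=128,  # batch size from Table 6
)

optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[60, 120], gamma=0.1)
criterion = nn.CrossEntropyLoss()

for epoch in range(160):
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()   # 10x LR drop at epochs 60 and 120
```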