CoNNect: Connectivity-Based Regularization for Structural Pruning of Neural Networks

Authors: Christian P.C. Franssen, Jinyang Jiang, Yijie Peng, Bernd Heidergott

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical experiments demonstrate that CoNNect can improve classical pruning strategies and enhance state-of-the-art one-shot pruners, such as DepGraph and LLM-Pruner.
Researcher Affiliation | Academia | Christian Franssen (EMAIL, Vrije Universiteit Amsterdam); Jinyang Jiang (EMAIL, Peking University); Yijie Peng (EMAIL, Peking University); Bernd Heidergott (EMAIL, Vrije Universiteit Amsterdam)
Pseudocode | No | The paper describes methods and formulations but does not present any structured pseudocode or algorithm blocks with explicit labels such as "Pseudocode" or "Algorithm".
Open Source Code | Yes | Our code is available at https://github.com/cfn420/CoNNect.
Open Datasets | Yes | We train the model on the Cora dataset (Sen et al., 2008), a graph-based dataset... ResNet-56 (He et al., 2016) and VGG-19 (Simonyan & Zisserman, 2015), which are pre-trained and fine-tuned on the CIFAR-10 and CIFAR-100 datasets (Krizhevsky, 2009)... We use 10 randomly selected samples from BookCorpus (Zhu et al., 2015)... During fine-tuning, we utilize Alpaca (Taori et al., 2023)... we conduct a zero-shot perplexity analysis on WikiText2 (Merity et al., 2022) and PTB (Marcus et al., 1993).
Dataset Splits | No | The paper mentions a "calibration set" of 10 samples from BookCorpus and a "zero-shot perplexity analysis" on several benchmarks, which implies evaluation on predefined test sets without explicit train/validation splits by the authors. For Cora, CIFAR-10, CIFAR-100, and Alpaca, it mentions training or fine-tuning but gives neither percentages nor sample counts for train/validation/test splits, and it does not cite specific predefined splits that would make the data partitioning reproducible.
Hardware Specification | Yes | All experiments were performed on a single NVIDIA RTX 4090 GPU with 24 GB of memory.
Software Dependencies | No | The paper mentions using Adam as the optimizer but does not specify any programming languages or software libraries with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | All models were trained for 200 epochs using Adam with a learning rate of 0.01, a cosine annealing scheduler, and a batch size of 256. After training, we pruned 96% of the weights in each layer... Finally, the model is fine-tuned with the same hyperparameters but with a decreased initial learning rate of 0.001 for 50 epochs.
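The quoted setup hinges on an aggressive 96% per-layer sparsity target. The pruning step can be sketched as follows, assuming plain per-layer magnitude pruning as the baseline criterion; the function name and toy weights are illustrative, and the paper's CoNNect connectivity regularizer is not reproduced here:

```python
def magnitude_prune(weights, sparsity=0.96):
    """Zero out the smallest-magnitude fraction of weights in one layer.

    Hedged sketch of a per-layer pruning step at the sparsity level
    quoted in the experiment setup; not the paper's CoNNect criterion.
    """
    n = len(weights)
    k = int(n * sparsity)  # number of weights to remove in this layer
    # Indices sorted by ascending |w|; the first k are pruned.
    order = sorted(range(n), key=lambda i: abs(weights[i]))
    pruned = set(order[:k])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]

layer = [0.5, -0.01, 0.3, 0.002, -0.7, 0.04, 0.9, -0.08, 0.006, 0.2]
# At 90% sparsity on 10 weights, only the largest-|w| weight survives.
print(magnitude_prune(layer, sparsity=0.9))
```

In the reported pipeline this step would sit between the 200-epoch training run and the 50-epoch fine-tuning pass at the reduced learning rate of 0.001.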