Structure-Preserving Network Compression Via Low-Rank Induced Training Through Linear Layers Composition
Authors: Ismail Alkhouri, Xitong Zhang, Rongrong Wang
TMLR 2024 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results (i) demonstrate the effectiveness of our approach using MNIST on Fully Connected Networks, CIFAR10 on Vision Transformers, and CIFAR10/100 and ImageNet on Convolutional Neural Networks, and (ii) illustrate that we achieve either competitive or state-of-the-art results when compared to leading structured pruning and low-rank training methods in terms of FLOPs and parameters drop. |
| Researcher Affiliation | Academia | Ismail R. Alkhouri (EMAIL; EMAIL), Department of Computational Mathematics, Science & Engineering, Michigan State University, and Department of Electrical Engineering & Computer Science, University of Michigan, Ann Arbor; Xitong Zhang (EMAIL), Department of Computational Mathematics, Science & Engineering, Michigan State University; Rongrong Wang (EMAIL), Department of Computational Mathematics, Science & Engineering, and Department of Mathematics, Michigan State University |
| Pseudocode | Yes | Algorithm 1 Compression with LoRITa+SVT. Input: L trainable weights Wi, i ∈ [L], factorization parameter N > 1, and singular value truncation parameter r. Output: Compressed and trained weights. |
| Open Source Code | Yes | Our code is available at https://github.com/XitongSystem/LoRITa/tree/main. |
| Open Datasets | Yes | Our experimental results (i) demonstrate the effectiveness of our approach using MNIST on Fully Connected Networks, CIFAR10 on Vision Transformers, and CIFAR10/100 and ImageNet on Convolutional Neural Networks... |
| Dataset Splits | No | The paper mentions using well-known datasets like MNIST, CIFAR10, CIFAR100, and ImageNet but does not explicitly provide specific training/validation/test splits, percentages, or methodology for reproducing the data partitioning for the main experiments. It mentions '120 randomly subsampled training data to compute E(l)' in Appendix A, but this is for an internal iterative process, not the overall dataset split for model training and evaluation. |
| Hardware Specification | No | The paper states, 'We use PyTorch to conduct our experiments,' but does not provide any specific details regarding the hardware used for these experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions 'We use PyTorch to conduct our experiments,' but it does not specify the version number of PyTorch or any other software dependencies required to reproduce the experimental setup. |
| Experiment Setup | Yes | First, we evaluate our proposed method on fully connected neural networks, varying the number of layers, utilizing the Adam optimizer with a learning rate set to 1×10⁻², and employing a constant layer dimension of 96 (other than the last). Overparameterization is applied across all layers in the model. To ensure a fair comparison, we begin by tuning the baseline model (N = 1) across a range of weight decay parameters {5×10⁻⁶, 1×10⁻⁵, 2×10⁻⁵, 5×10⁻⁵, 1×10⁻⁴, 2×10⁻⁴}. Subsequently, we extend our exploration of weight decay within the same parameter range for models with N > 1. ... The learning rate applied in this evaluation is set to 3×10⁻⁴. The weight decay was searched over {1×10⁻², 5×10⁻³, 1×10⁻³} for CIFAR10 and {1×10⁻⁵, 5×10⁻⁵, 1×10⁻⁴} for CIFAR100. ... All the considered ViT models underwent optimization via the Adam optimizer with a learning rate of 3×10⁻⁴. The hidden dimension is 256 for all ViTs. ... we initially fine-tuned the baseline model (N = 1) across the following weight decay parameters {5×10⁻⁵, 1×10⁻⁴, 2×10⁻⁴, 5×10⁻⁴, 1×10⁻³, 2×10⁻³}. |
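The core mechanism behind Algorithm 1 (LoRITa+SVT) can be illustrated with a minimal NumPy sketch: each trainable weight is overparameterized as a composition of N linear factors, and after training the composed matrix is compressed by singular value truncation (SVT) to rank r. This is an illustrative assumption of the paper's idea, not the authors' implementation; the function names `factorize_init`, `compose`, and `svt`, and the choice of square inner factors, are hypothetical.

```python
import numpy as np

def factorize_init(d_out, d_in, N, rng):
    # Overparameterize one d_out x d_in linear layer as a product of N factors,
    # W = F_N @ ... @ F_1. Square inner factors are a simplifying assumption.
    dims = [d_in] * N + [d_out]
    return [rng.standard_normal((dims[i + 1], dims[i])) / np.sqrt(dims[i])
            for i in range(N)]

def compose(factors):
    # Collapse the composition of linear layers back into a single matrix.
    W = factors[0]
    for F in factors[1:]:
        W = F @ W
    return W

def svt(W, r):
    # Singular value truncation: keep only the top-r singular triplets.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

rng = np.random.default_rng(0)
factors = factorize_init(d_out=64, d_in=96, N=3, rng=rng)  # N > 1 as in Algorithm 1
W = compose(factors)          # effective weight seen at inference time
W_r = svt(W, r=8)             # compressed weight of rank at most r
```

In the paper, the weight-decay regularization during training encourages the composed product to be approximately low-rank, so the truncation step discards little accuracy; this sketch only shows the factorize/compose/truncate plumbing.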