MCNC: Manifold-Constrained Reparameterization for Neural Compression

Authors: Chayne Thrash, Reed Andreas, Ali Abbasi, Parsa Nooralinejad, Soroush Abbasi Koohpayegani, Hamed Pirsiavash, Soheil Kolouri

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Through extensive experiments in computer vision and natural language processing tasks, we demonstrate that our method significantly outperforms state-of-the-art baselines in terms of compression, accuracy, and/or model reconstruction time."
Researcher Affiliation | Academia | "1 Department of Computer Science, Vanderbilt University, Nashville, TN 37235; 2 Department of Computer Science, University of California, Davis, CA 95616"
Pseudocode | Yes | "A.1 EXAMPLE CODE FOR APPLYING MCNC: Below, we include PyTorch code which shows how to create a linear layer which is reparameterized using MCNC. import torch; import torch.nn as nn; import torch.nn.functional as F; from math import ceil; class MCNC_Linear(nn.Linear): …"
Open Source Code | Yes | "Our code is publicly available at https://github.com/mint-vu/MCNC."
Open Datasets | Yes | "We begin by evaluating on training Vision Transformers from scratch on the ImageNet-100 dataset (Tian et al., 2020). We compare MCNC on CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009). We conducted fine-tuning experiments on two variants of the LLaMA-2 (Touvron et al., 2023) language model, LLaMA 7B and LLaMA 13B, using the Alpaca dataset as our training data (Taori et al., 2023)."
Dataset Splits | Yes | "We begin by evaluating on training Vision Transformers from scratch on the ImageNet-100 dataset (Tian et al., 2020). We compare MCNC on CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009). We conducted fine-tuning experiments on two variants of the LLaMA-2 (Touvron et al., 2023) language model, LLaMA 7B and LLaMA 13B, using the Alpaca dataset as our training data (Taori et al., 2023). We report both training loss and validation loss on the Alpaca dataset."
Hardware Specification | Yes | "We use a single RTX 3090 GPU to calculate the throughput. We perform this experiment 100 times using an RTX A6000 and report the average timings in Table 8."
Software Dependencies | No | "We use the training code from QLoRA (Dettmers et al., 2023) and NOLA (Koohpayegani et al., 2024) for our experiments."
Experiment Setup | Yes | "We provide hyperparameters used for our method and baselines in section A.3. Implementation Details: We use the training code from QLoRA (Dettmers et al., 2023) and NOLA (Koohpayegani et al., 2024) for our experiments. We quantize the original parameters of the language model to 4-bit and fine-tune the adapter applied to all layers of the transformer. For NOLA, we followed the hyperparameters reported in (Koohpayegani et al., 2024). We set the rank to 8 for both NOLA and our method in LLaMA 7B, and to 16 in LLaMA 13B. In our method, we use a generator with the following specifications: a 3-layer MLP with an input dimension of 5, an output dimension of 5000, and a hidden dimension of 32. In NOLA, we use 64 bases for A and B in LLaMA 7B and 140 bases for LLaMA 13B. These numbers of bases result in the same number of optimized parameters as our method for each architecture. All other hyperparameters were identical for both methods except the learning rate; we used lr = 0.001 for NOLA and 0.01 for ours."
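The pseudocode and experiment-setup rows above quote a generator that is a 3-layer MLP mapping a 5-dimensional input to a 5000-dimensional output with hidden width 32. The sketch below illustrates the general idea such a setup implies: optimize a handful of low-dimensional inputs constrained to the unit sphere (a compact manifold), and let a small frozen random generator expand them into chunks of a layer's weight matrix. This is a minimal NumPy illustration, not the authors' implementation (see their repository for the real `MCNC_Linear`); the chunking scheme, sphere projection, and generator initialization here are assumptions.

```python
import numpy as np
from math import ceil

rng = np.random.default_rng(0)

# Generator sizes quoted in the setup row above.
IN_DIM, HID, OUT_DIM = 5, 32, 5000

# Frozen, randomly initialized 3-layer MLP g: R^5 -> R^5000.
# It is never trained, so it can be reproduced from a seed at load time.
W1, b1 = rng.standard_normal((IN_DIM, HID)), np.zeros(HID)
W2, b2 = rng.standard_normal((HID, HID)), np.zeros(HID)
W3, b3 = rng.standard_normal((HID, OUT_DIM)), np.zeros(OUT_DIM)

def generator(z):
    h = np.tanh(z @ W1 + b1)
    h = np.tanh(h @ W2 + b2)
    return h @ W3 + b3

def reparameterized_weight(zs, out_features, in_features):
    """Assemble an (out, in) weight matrix from sphere-constrained inputs.

    Each trained parameter vector z is projected onto the unit sphere
    (the manifold constraint), expanded to 5000 values by the frozen
    generator, and the chunks are concatenated and trimmed to size.
    """
    n = out_features * in_features
    chunks = [generator(z / np.linalg.norm(z)) for z in zs]
    return np.concatenate(chunks)[:n].reshape(out_features, in_features)

# A 128x256 linear layer (32768 weights) needs ceil(32768 / 5000) = 7 chunks,
# i.e. only 7 * 5 = 35 trained scalars instead of 32768.
out_f, in_f = 128, 256
n_chunks = ceil(out_f * in_f / OUT_DIM)
zs = rng.standard_normal((n_chunks, IN_DIM))
W = reparameterized_weight(zs, out_f, in_f)
print(W.shape, n_chunks * IN_DIM)  # (128, 256) 35
```

During training, only `zs` would receive gradients (through the frozen generator), which is what makes the number of optimized parameters comparable between this scheme and NOLA's basis counts in the table above.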