Quantization Aware Factorization for Deep Neural Network Compression
Authors: Daria Cherniuk, Stanislav Abukhovich, Anh-Huy Phan, Ivan Oseledets, Andrzej Cichocki, Julia Gusak
JAIR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compress neural network weights with a devised algorithm and evaluate its prediction quality and performance. We compare our approach to state-of-the-art post-training quantization methods and demonstrate competitive results and high flexibility in achieving a desirable quality-performance tradeoff. |
| Researcher Affiliation | Academia | Daria Cherniuk EMAIL Stanislav Abukhovich EMAIL Anh-Huy Phan EMAIL Ivan Oseledets EMAIL Andrzej Cichocki EMAIL Skolkovo Institute of Science and Technology, Russia Julia Gusak EMAIL Inria, University of Bordeaux, France |
| Pseudocode | Yes | Algorithm 1: Solve (9) using ADMM. Algorithm 2: Solve (15) using ADMM-EPC. |
| Open Source Code | Yes | Our source code and experiments are available as an open GitHub repository: https://github.com/KamikaziZen/admm-quantization |
| Open Datasets | Yes | We use a pretrained ResNet18 model from the torchvision model zoo (https://pytorch.org/vision/stable/models.html) and the ImageNet dataset (https://www.image-net.org/) to evaluate our method. |
| Dataset Splits | Yes | Batch Norm calibration. Since ResNet models have many Batch Norm layers, and both factorization and quantization disturb the distribution of outputs, calibrating Batch Norm layers' parameters (performing inference with no gradients or activations accumulation) can boost model prediction quality. The same procedure was used, for example, in AdaQuant (Hubara et al., 2020). In all our experiments we perform calibration on 2048 samples from the training dataset. This is the same number of samples as used in AdaRound (Nagel et al., 2020) and AdaQuant. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. It only mentions target devices (mobile or embedded) but not the experimental setup hardware. |
| Software Dependencies | No | The paper mentions using a "pretrained ResNet18 model from the torchvision model zoo", which implies PyTorch and torchvision are used, but specific version numbers for these or any other software dependencies are not provided. |
| Experiment Setup | Yes | Metrics. In some ablation studies we compute a quantized reconstruction error metric: $e_{quant} = \|X_{(2)} - \hat{B}(\hat{C} \odot \hat{A})^T\|_F \,/\, \|X_{(2)}\|_F$ (17), where $\hat{A}, \hat{B}, \hat{C}$ are quantized factors of tensor $X$, and $X_{(2)}$ is its mode-2 matrix unfolding. We could use any other unfolding; the purpose of this notation is merely to address the fact that the Frobenius norm is formulated for matrices. Taking into consideration the benefits of reducing both MAC operations (factorization) and operands' bit-width (quantization), we adopt the BOP metric introduced in (van Baalen et al., 2020) and use it to compare our method with other approaches. For layer $l$, the BOP count is computed as $BOPs(l) = MACs(l) \cdot b_w \cdot b_a$ (18), where $b_w$ is the bit-width of weights and $b_a$ is the bit-width of (input) activations. While computing this metric for the whole model we take into consideration all layers, including Batch Norms. Factorization ranks. To choose a rank for each layer's factorization, we set up a parameter reduction rate (i.e. the ratio of the number of the factorized layer's parameters over that of the original layer) and define the rank as $N/((n+m) \cdot rate)$ for fully-connected layers and $N/((n+m+k) \cdot rate)$ for convolutions, where $N$ is the number of parameters in the original layer, $n$, $m$ and $k$ are the dimensions of the reshaped convolution weights (Section 2.1), and $rate$ denotes the parameter reduction rate. Batch Norm calibration. Since ResNet models have many Batch Norm layers, and both factorization and quantization disturb the distribution of outputs, calibrating Batch Norm layers' parameters (performing inference with no gradients or activations accumulation) can boost model prediction quality. The same procedure was used, for example, in AdaQuant (Hubara et al., 2020). In all our experiments we perform calibration on 2048 samples from the training dataset. |
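The metrics and rank-selection rule quoted above are simple enough to sketch in code. The following is a minimal illustration, not the paper's implementation: function names (`bop_count`, `factorization_rank`, `quant_reconstruction_error`) are our own, and the quantized factors are assumed given (the paper obtains them via ADMM, which is not shown here). The mode-2 unfolding of a CP-decomposed tensor satisfies $X_{(2)} \approx B(C \odot A)^T$, with $\odot$ the Khatri-Rao product.

```python
import numpy as np

def bop_count(macs, bw, ba):
    """BOPs(l) = MACs(l) * b_w * b_a  (Eq. 18)."""
    return macs * bw * ba

def factorization_rank(num_params, dims, rate):
    """Rank for a target parameter reduction rate.

    dims is (n, m) for fully-connected layers or (n, m, k) for
    convolutions (dimensions of the reshaped weight tensor);
    rate > 1 shrinks the factorized layer by roughly that factor.
    """
    return max(1, int(num_params / (sum(dims) * rate)))

def khatri_rao(C, A):
    """Column-wise Kronecker (Khatri-Rao) product of C (k x r) and A (n x r)."""
    k, r = C.shape
    n = A.shape[0]
    return np.einsum('ir,jr->ijr', C, A).reshape(k * n, r)

def quant_reconstruction_error(X2, A_hat, B_hat, C_hat):
    """e_quant = ||X_(2) - B_hat (C_hat ⊙ A_hat)^T||_F / ||X_(2)||_F  (Eq. 17)."""
    recon = B_hat @ khatri_rao(C_hat, A_hat).T
    return np.linalg.norm(X2 - recon) / np.linalg.norm(X2)
```

For example, an 8-bit weight / 8-bit activation layer with 1000 MACs costs 64000 BOPs, and a 512×1024 fully-connected layer at reduction rate 2 gets rank ⌊524288 / (1536 · 2)⌋ = 170.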