Understanding the Unfairness in Network Quantization

Authors: Bing Liu, Wenjun Miao, Boyu Zhang, Qiankun Zhang, Bin Yuan, Jing Wang, Shenghao Liu, Xianjun Deng

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Theoretical analysis with empirical verification identifies two responsible factors and examines in depth how they influence a fairness metric. The paper then compares PTQ and QAT, explaining the observation that QAT behaves even worse than PTQ in fairness, although it often preserves higher accuracy at lower bit-widths. Finally, based on the further observation that class imbalance produces distinct values of these factors across attribute classes, the paper finds that several simple data augmentation methods can alleviate the disparate impacts of quantization. "We experiment on either imbalanced (UTK-Face and FER2013) or balanced (CIFAR10 and MNIST) datasets using ResNet and VGG models for empirical evaluation."
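To make the PTQ/QAT comparison concrete, the sketch below shows the basic operation both methods share: mapping weights onto a b-bit uniform grid, where smaller bit-widths mean coarser rounding. The function name and the symmetric-grid scheme are illustrative assumptions, not the paper's exact quantizer.

```python
def quantize_uniform(weights, bits):
    """Symmetric uniform quantization of weights to a signed b-bit grid.

    Each weight is rounded to the nearest of the 2**(bits-1) - 1 positive
    (and mirrored negative) levels spanning [-max_abs, max_abs].
    Illustrative sketch; the paper's scheme may differ.
    """
    max_abs = max(abs(w) for w in weights) or 1.0
    levels = 2 ** (bits - 1) - 1          # positive levels on the signed grid
    scale = max_abs / levels              # spacing between adjacent levels
    return [round(w / scale) * scale for w in weights]


weights = [0.91, -0.42, 0.07, -0.88]
fine = quantize_uniform(weights, 8)       # 8-bit: small rounding error
coarse = quantize_uniform(weights, 2)     # 2-bit: most weights collapse
```

At 8 bits the quantized values stay close to the originals; at 2 bits small-magnitude weights collapse toward zero, which is the kind of information loss whose per-class disparity the paper analyzes.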
Researcher Affiliation Academia 1) School of Cyber Science and Engineering, Huazhong University of Science and Technology, Wuhan, China; 2) Key Laboratory of Cyberspace Security, Ministry of Education, Zhengzhou, China; 3) Hubei Key Laboratory of Distributed System Security, Wuhan, China; 4) Songshan Laboratory, Zhengzhou, China; 5) Visiting researcher with the Lion Rock Labs of Cyberspace Security, CTl HE, Hong Kong, China; 6) School of Software Engineering, Huazhong University of Science and Technology, Wuhan, China. Correspondence to: Qiankun Zhang <EMAIL>.
Pseudocode No The paper contains mathematical equations, theorems, and proofs but does not include any structured pseudocode or algorithm blocks.
Open Source Code No The paper does not provide any explicit statement about releasing source code for the described methodology, nor does it include any links to code repositories.
Open Datasets Yes "We experiment on either imbalanced (UTK-Face and FER2013) or balanced (CIFAR10 and MNIST) datasets using ResNet and VGG models for empirical evaluation." ... UTK-Face dataset (Zhang et al., 2017) ... FER2013 (Goodfellow et al., 2013) ... CIFAR-10 (Krizhevsky et al., 2010) ... MNIST (Deng, 2012).
Dataset Splits Yes Table 4. Datasets used in our experiments.
Dataset | Description | Training Set | Test Set | Labels
UTK-Face | Face image | 18,964 | 4,741 | Age, gender, ethnicity
FER2013 | Facial expression image | 28,708 | 7,178 | Seven facial expressions
CIFAR-10 | RGB image | 50,000 | 10,000 | Ten object classes
Imbalanced-CIFAR-10 | RGB image | 19,375 | 10,000 | Ten object classes
MNIST | Handwritten-digits image | 60,000 | 10,000 | 0-9
Imbalanced-MNIST | Handwritten-digits image | 23,782 | 10,000 | 0-9
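The paper reports Imbalanced-CIFAR-10 and Imbalanced-MNIST splits but does not document how they were constructed. A common way to derive such splits is exponential per-class subsampling; the sketch below uses that convention purely as an assumption (the decay ratio and function name are illustrative, and the resulting counts will not match the paper's exact split sizes).

```python
import random

def make_imbalanced(samples, labels, ratio=0.75, seed=0):
    """Subsample a balanced dataset into a long-tailed one.

    Class k keeps floor(n_k * ratio**k) examples, giving an exponentially
    decaying class profile. Assumed construction for illustration only;
    the paper does not specify its imbalancing procedure.
    """
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    out_x, out_y = [], []
    for k in sorted(by_class):
        keep = int(len(by_class[k]) * ratio ** k)
        out_x.extend(rng.sample(by_class[k], keep))
        out_y.extend([k] * keep)
    return out_x, out_y
```

On a toy balanced set of 10 classes with 100 samples each, class 0 keeps all 100 examples while later classes shrink geometrically, reproducing the kind of head/tail disparity the paper studies.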
Hardware Specification Yes The training process is performed using an NVIDIA 3090Ti device.
Software Dependencies Yes All experiments are conducted in a Python 3.10 environment using the PyTorch framework.
Experiment Setup Yes The hyperparameters for all the models are set with an initial learning rate of 0.001, which is gradually reduced based on the number of epochs during training to optimize the models. The VGG19 model is trained for 40 to 60 epochs, the ResNet18 model is trained for 100 epochs, and the ResNet50 model is trained for approximately 200 epochs.
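The paper states only that the 0.001 initial learning rate is "gradually reduced based on the number of epochs". A step-decay schedule is one standard realization of that description; the decay factor and step size below are assumptions for illustration, not values from the paper.

```python
def step_lr(epoch, initial_lr=0.001, decay=0.1, step=30):
    """Step-decay schedule: multiply the learning rate by `decay`
    every `step` epochs. Decay factor and step size are assumed;
    the paper does not report them.
    """
    return initial_lr * decay ** (epoch // step)
```

For example, with these assumed values the rate stays at 0.001 for epochs 0-29, drops to 0.0001 at epoch 30, and to 0.00001 at epoch 60 (matching the ResNet50-scale training horizon of roughly 200 epochs).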