Revisiting Random Weight Perturbation for Efficiently Improving Generalization

Authors: Tao Li, Qinghua Tao, Weihao Yan, Yingwen Wu, Zehao Lei, Kun Fang, Mingzhen He, Xiaolin Huang

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Through extensive experimental evaluations, we demonstrate that our enhanced RWP methods achieve greater efficiency in enhancing generalization, particularly in large-scale problems, while also offering comparable or even superior performance to SAM. The code is released at https://github.com/nblt/mARWP." "In this section, we present extensive experimental results to demonstrate the efficiency and effectiveness of our proposed methods. We begin by introducing the experimental setup and then evaluate the performance over three standard benchmark datasets: CIFAR-10, CIFAR-100, and ImageNet. We also conduct ablation studies on the hyper-parameters and visualize the loss landscape to provide further insights."
Researcher Affiliation | Academia | Tao Li (Shanghai Jiao Tong University); Qinghua Tao (KU Leuven); Weihao Yan (Shanghai Jiao Tong University); Yingwen Wu (Shanghai Jiao Tong University); Zehao Lei (Shanghai Jiao Tong University); Kun Fang (Shanghai Jiao Tong University); Mingzhen He (Shanghai Jiao Tong University); Xiaolin Huang (Shanghai Jiao Tong University)
Pseudocode | No | The paper describes its methods and algorithms mathematically and in prose, but it does not contain any clearly labeled pseudocode or algorithm blocks. For example, Section 5, "Improving Random Weight Perturbation," details the proposed methods with mathematical formulations and textual explanations.
Open Source Code | Yes | "The code is released at https://github.com/nblt/mARWP."
Open Datasets | Yes | "We experiment over three benchmark image classification tasks: CIFAR-10, CIFAR-100 (Krizhevsky & Hinton, 2009), and ImageNet (Deng et al., 2009)."
Dataset Splits | Yes | "We experiment over three benchmark image classification tasks: CIFAR-10, CIFAR-100 (Krizhevsky & Hinton, 2009), and ImageNet (Deng et al., 2009). For CIFAR, we apply standard random horizontal flipping, cropping, normalization, and Cutout augmentation (DeVries & Taylor, 2017) (except for ViT, for which we use RandAugment (Cubuk et al., 2020)). For ImageNet, we apply basic data preprocessing and augmentation following the public PyTorch example (Paszke et al., 2017). Mean and standard deviation are calculated over three independent trials."
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications. It only describes training parameters such as batch size, epochs, and learning rates; for example, under "Training Settings": "For CIFAR experiments, we set the training epochs to 200 with batch size 256, momentum 0.9, and weight decay 0.001..."
Software Dependencies | No | The paper mentions following a "public PyTorch example" and using optimizers such as "Adam (Kingma & Ba, 2015)", but it does not specify version numbers for any software libraries, frameworks, or environments used in the experiments.
Experiment Setup | Yes | "For CIFAR experiments, we set the training epochs to 200 with batch size 256, momentum 0.9, and weight decay 0.001 (Du et al., 2022a; Zhao et al., 2022a), keeping the same among all methods for a fair comparison (except for ViT, we adopt a longer training schedule and provide the details in Appendix E). For SAM, we conduct a grid search for ρ over {0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5}... For RWP and ARWP, we search σ over {0.005, 0.01, 0.015, 0.02} and set σ = 0.01... For m-RWP and m-ARWP, we use σ = 0.015 and λ = 0.5... We set η = 0.1 and β = 0.99 as default choice. For ImageNet experiments, we set the training epochs to 90 with batch size 256, weight decay 0.0001, and momentum 0.9. We use ρ = 0.05 for SAM... σ = 0.003 for RWP and ARWP, and σ = 0.005, λ = 0.5 for m-ARWP. We employ m-sharpness with m = 128 for SAM... For all experiments, we adopt cosine learning rate decay (Loshchilov & Hutter, 2016) with an initial learning rate of 0.1 and record the final model performance on the test set."
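The setup above uses cosine learning-rate decay (Loshchilov & Hutter, 2016) from an initial rate of 0.1. A minimal sketch of that schedule in plain Python; the function name and per-epoch granularity are illustrative, not taken from the paper's released code:

```python
import math

def cosine_lr(epoch, total_epochs, lr_init=0.1):
    """Cosine learning-rate decay: anneals from lr_init at epoch 0
    toward 0 at total_epochs, via 0.5 * lr_init * (1 + cos(pi * t / T))."""
    return 0.5 * lr_init * (1.0 + math.cos(math.pi * epoch / total_epochs))
```

With the paper's 200-epoch CIFAR schedule, this gives 0.1 at epoch 0, 0.05 at epoch 100, and a rate approaching 0 at epoch 200.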
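The rows above fix a Gaussian perturbation scale σ (e.g. 0.015) and a mixing weight λ = 0.5 for m-RWP/m-ARWP. As a hedged sketch only, assuming the mixed gradient takes the form λ·∇L(w + ε) + (1 − λ)·∇L(w) with ε ~ N(0, σ²); consult the released code at https://github.com/nblt/mARWP for the authors' actual implementation:

```python
import random

def rwp_step(w, grad_fn, lr=0.1, sigma=0.015, lam=0.5):
    """One mixed-gradient RWP update on a flat list of parameters.

    Assumed form (not verified against the authors' code): mix the
    gradient at randomly perturbed weights with the gradient at the
    original weights, weighted by lam and (1 - lam) respectively.
    grad_fn(w) must return the loss gradient at w as a list.
    """
    eps = [random.gauss(0.0, sigma) for _ in w]            # random weight perturbation
    g_pert = grad_fn([wi + ei for wi, ei in zip(w, eps)])  # gradient at perturbed weights
    g_orig = grad_fn(w)                                    # gradient at original weights
    return [wi - lr * (lam * gp + (1.0 - lam) * go)        # mixed-gradient SGD step
            for wi, gp, go in zip(w, g_pert, g_orig)]
```

With sigma = 0 the two gradients coincide and the update reduces to plain SGD, which is a quick sanity check on the mixing.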
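The CIFAR pipeline applies Cutout augmentation (DeVries & Taylor, 2017), which masks out a square patch of each input image. A small illustrative NumPy sketch; the deterministic `center` argument is for clarity, whereas the real augmentation samples the patch location randomly:

```python
import numpy as np

def cutout(image, center, size):
    """Zero out a square patch of side `size` centered at `center`
    in an H x W x C image, clipping the patch at the image borders.
    Returns a masked copy; the input array is left untouched."""
    h, w = image.shape[:2]
    cy, cx = center
    y0, y1 = max(0, cy - size // 2), min(h, cy + size // 2)
    x0, x1 = max(0, cx - size // 2), min(w, cx + size // 2)
    out = image.copy()
    out[y0:y1, x0:x1] = 0.0  # masked region is set to zero
    return out
```

On a 32x32x3 CIFAR-sized input, an 8x8 patch zeroes 64 of the 1024 pixel positions per channel.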