Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late In Training

Authors: Zhanpeng Zhou, Mingze Wang, Yuchen Mao, Bingrui Li, Junchi Yan

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments are conducted using SAM in Equation (3), whereas our theoretical analyses in Section 4 apply the simplified SAM in Equation (4). Specifically, we perform experiments on the commonly used image classification datasets CIFAR-10/100 (Krizhevsky et al., 2009), with standard architectures such as WideResNet (Zagoruyko & Komodakis, 2016), ResNet (He et al., 2016), and VGG (Simonyan & Zisserman, 2015).
Researcher Affiliation | Academia | 1 Sch. of Computer Science & Sch. of Artificial Intelligence, Shanghai Jiao Tong University; 2 Peking University; 3 Tsinghua University; 4 Shanghai Artificial Intelligence Laboratory
Pseudocode | No | No explicit pseudocode or algorithm blocks are provided; the paper describes its methods through mathematical equations and text.
Open Source Code | Yes | We released our source code at https://github.com/zzp1012/SAM-in-Late-Phase.
Open Datasets | Yes | We perform experiments on the commonly used image classification datasets CIFAR-10/100 (Krizhevsky et al., 2009)
Dataset Splits | Yes | We perform experiments on the commonly used image classification datasets CIFAR-10/100 (Krizhevsky et al., 2009), with standard architectures such as WideResNet (Zagoruyko & Komodakis, 2016), ResNet (He et al., 2016), and VGG (Simonyan & Zisserman, 2015). We use the standard configurations for the basic training settings shared by SAM and SGD (e.g., learning rate, batch size, and data augmentation) as in the original papers.
Hardware Specification | No | No specific hardware details (such as GPU/CPU models or processor types) are mentioned in the paper.
Software Dependencies | No | The paper mentions 'an implementation limitation in PyTorch' but does not specify a version number for PyTorch or any other software dependencies.
Experiment Setup | Yes | We use the standard configurations for the basic training settings shared by SAM and SGD (e.g., learning rate, batch size, and data augmentation) as in the original papers, and set the SAM-specific perturbation radius ρ to 0.05, as recommended by Foret et al. (2021). ... A weight decay of 5 * 10^-4 is applied, and the momentum for gradient update is set to 0.9. The learning rate is initialized at 0.1 and is dropped by 10 times at epoch 80. The total number of training epochs is 160.
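For reference, the SAM update the quoted setup describes (ascend to the worst-case point inside a ρ-ball around the weights, then take the descent step using the gradient evaluated there) can be sketched with the quoted hyperparameters (ρ = 0.05, lr = 0.1, momentum = 0.9, weight decay 5e-4). This is a minimal NumPy illustration on a toy quadratic loss, not the paper's implementation: the actual experiments use PyTorch with CIFAR-scale models, and the toy loss and variable names here are assumptions for demonstration only.

```python
import numpy as np

# Hyperparameters quoted in the experiment setup above.
RHO, LR, MOMENTUM, WEIGHT_DECAY = 0.05, 0.1, 0.9, 5e-4

# Toy ill-conditioned quadratic loss (illustrative stand-in for the
# training loss; the paper trains WideResNet/ResNet/VGG on CIFAR-10/100).
CURVATURE = np.array([1.0, 10.0])

def loss(w):
    return 0.5 * np.dot(w, CURVATURE * w)

def grad(w):
    return CURVATURE * w

def sam_step(w, velocity):
    """One SAM step with SGD-style momentum and L2 weight decay."""
    g = grad(w)
    # Ascent step: move to the (first-order) worst-case point on the rho-ball.
    eps = RHO * g / (np.linalg.norm(g) + 1e-12)
    # Descent gradient is evaluated at the perturbed weights,
    # with L2 weight decay added to the gradient.
    g_sam = grad(w + eps) + WEIGHT_DECAY * w
    velocity = MOMENTUM * velocity + g_sam
    return w - LR * velocity, velocity

w0 = np.array([1.0, 1.0])
w, v = w0.copy(), np.zeros_like(w0)
for _ in range(50):
    w, v = sam_step(w, v)
print(round(float(loss(w)), 4))  # final loss, well below the initial 5.5
```

Note that each SAM step costs two gradient evaluations (one at `w`, one at `w + eps`), which is why SAM roughly doubles per-epoch compute relative to SGD.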