On the Hyperparameters in Stochastic Gradient Descent with Momentum

Authors: Bin Shi

JMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | An experimental comparison between SGD and SGD with momentum is shown in Figure 1, which plots the training error for a 20-layer convolutional neural network on CIFAR-10 (Krizhevsky, 2009) with a mini-batch size of 128, learning rate s = 0.01, and momentum coefficient α = 0.9.
Researcher Affiliation | Academia | Bin Shi, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China; School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China.
Pseudocode | No | The paper describes algorithms such as SGD with momentum and Nesterov momentum using mathematical equations (e.g., x_{k+1} = x_k - s∇f(x_k) + sξ_k + α(x_k - x_{k-1})) but does not contain any structured pseudocode or algorithm blocks.
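The momentum recursion quoted above can be sketched in a few lines of NumPy. This is an illustrative reading of the equation, not code from the paper (which releases none); the quadratic objective f(x) = ||x||²/2 and the zero-noise default are stand-ins chosen here for demonstration.

```python
import numpy as np

def sgd_momentum_step(x_k, x_prev, grad, s=0.01, alpha=0.9, noise_scale=0.0, rng=None):
    """One iteration of the update quoted from the paper:
    x_{k+1} = x_k - s * grad(x_k) + s * xi_k + alpha * (x_k - x_{k-1}),
    where xi_k models the stochastic gradient noise (zero by default here)."""
    rng = rng or np.random.default_rng(0)
    xi_k = noise_scale * rng.standard_normal(np.shape(x_k))
    return x_k - s * grad + s * xi_k + alpha * (x_k - x_prev)

# Illustrative run on f(x) = 0.5 * ||x||^2, whose gradient is simply x.
x_prev = np.array([1.0, 1.0])
x_k = np.array([1.0, 1.0])
for _ in range(200):
    x_k, x_prev = sgd_momentum_step(x_k, x_prev, grad=x_k), x_k
```

With s = 0.01 and α = 0.9 (the paper's reported values), the iterate contracts toward the minimizer at roughly sqrt(α) per step on this quadratic.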
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. There are no explicit statements about code release, repository links, or code in supplementary materials.
Open Datasets | Yes | The setting is a 20-layer convolutional neural network on CIFAR-10 (Krizhevsky, 2009) with a mini-batch size of 128.
Dataset Splits | No | The paper mentions using CIFAR-10 and a mini-batch size of 128 but does not explicitly provide training/validation/test splits or cite predefined splits for reproducibility of the data partitioning.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments. It only mentions a 20-layer convolutional neural network, which implies computation but no hardware specifics.
Software Dependencies | No | The paper discusses the theoretical analysis of optimization algorithms and mentions deep learning in the context of experiments, but it does not specify any software names with version numbers (e.g., Python, PyTorch, TensorFlow) used to implement the methods or run the experiments.
Experiment Setup | Yes | The setting is a 20-layer convolutional neural network on CIFAR-10 (Krizhevsky, 2009) with a mini-batch size of 128. Learning rate: s = 0.01. Momentum coefficient: α = 0.9.
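For readers who want to mirror the reported setup, the stated hyperparameters map directly onto a standard PyTorch SGD configuration. This is a hedged sketch, not the paper's code: the linear layer is a hypothetical stand-in for the unreleased 20-layer CNN.

```python
import torch

# Hypothetical stand-in; the paper's actual model is a 20-layer CNN on CIFAR-10.
model = torch.nn.Linear(3 * 32 * 32, 10)

# Hyperparameters as reported in the paper: learning rate s = 0.01,
# momentum coefficient alpha = 0.9, mini-batch size 128 (the batch size
# would be passed to the CIFAR-10 DataLoader).
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
batch_size = 128
```

Note that PyTorch's momentum buffer update differs slightly in parameterization from the two-step recursion x_{k+1} = x_k - s∇f(x_k) + sξ_k + α(x_k - x_{k-1}), though the two are equivalent up to a change of variables.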