Faster Double Adaptive Gradient Methods
Authors: Feihu Huang, Yuning Luo
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct some experiments on image classification and language modeling tasks to verify efficiency of our proposed methods. |
| Researcher Affiliation | Academia | 1College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China 2MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing, China |
| Pseudocode | Yes | Algorithm 1: Double Adaptive SGD (2Ada SGD) Algorithm; Algorithm 2: Double Adaptive SPIDER (2Ada SPIDER) Algorithm |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for their methodology is publicly available. |
| Open Datasets | Yes | In the experiment, we conduct image classification task on CIFAR-10 (Krizhevsky, Hinton et al. 2009) and Imagenet (Deng et al. 2009) datasets, respectively. In the experiment, we conduct language modeling task on the Penn-Treebank (Marcus, Santorini, and Marcinkiewicz 1993) and Wiki Text2 (Merity et al. 2016) datasets, respectively. |
| Dataset Splits | Yes | In the experiment, we conduct image classification task on CIFAR-10 (Krizhevsky, Hinton et al. 2009) and Imagenet (Deng et al. 2009) datasets, respectively. Specifically, we train a 3-layer Convolutional Neural Network (CNN) on the CIFAR-10 dataset and train the ResNet18 (He et al. 2016) on the Imagenet dataset. Specifically, we will train a 2-layer LSTM (Hochreiter and Schmidhuber 1997) on the Penn-Treebank dataset and train a 2-layer Transformer (Vaswani 2017) on the WikiText2 dataset. |
| Hardware Specification | Yes | All experiments are run over a machine with Intel(R) Xeon(R) Platinum 8352V CPU and 1 Nvidia RTX 4090 GPU. |
| Software Dependencies | No | The paper mentions neural network models like CNN, ResNet18, LSTM, and Transformer, but does not specify the version numbers of any software libraries, frameworks, or programming languages used (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | For the learning rates and other hyper-parameters, we do grid search and report the best one for each optimizer. We set γ = 10^{-3}, m = 50 in our 2Ada SGD algorithm, and set γ = 10^{-2}, b = 64 in our 2Ada SPIDER algorithm. In other algorithms, we set the basic learning rate as 0.001, and the basic batch size as 64. Here the neural network architecture of the 3-layer CNN is provided in Table 2. |
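
The paper reports only that hyper-parameters were chosen by grid search, with the best configuration kept per optimizer. The sketch below illustrates that selection procedure in plain Python; the `train_eval` objective and the candidate values are hypothetical stand-ins (the paper does not publish its grid), with the toy optimum chosen to mirror the reported defaults of learning rate 0.001 and batch size 64.

```python
from itertools import product

def grid_search(train_eval, grid):
    """Try every hyper-parameter combination and keep the best.

    train_eval: callable mapping a config dict to a validation score
                (higher is better); a stand-in for a full training run.
    grid: dict mapping hyper-parameter name -> list of candidate values.
    """
    best_score, best_cfg = float("-inf"), None
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        cfg = dict(zip(keys, values))
        score = train_eval(cfg)
        if score > best_score:
            best_score, best_cfg = score, cfg
    return best_cfg, best_score

# Hypothetical toy objective: peaks at lr=1e-3, batch_size=64,
# matching the defaults the paper reports for the baselines.
def toy_eval(cfg):
    return -abs(cfg["lr"] - 1e-3) - 1e-3 * abs(cfg["batch_size"] - 64)

grid = {"lr": [1e-1, 1e-2, 1e-3], "batch_size": [32, 64, 128]}
best_cfg, _ = grid_search(toy_eval, grid)
print(best_cfg)  # {'lr': 0.001, 'batch_size': 64}
```

In a real run, `train_eval` would train the model to completion under `cfg` and return a held-out metric, so the exhaustive loop can be expensive; the paper's use of a single best configuration per optimizer is consistent with this style of search.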