Closing the gap between SVRG and TD-SVRG with Gradient Splitting

Authors: Arsenii Mustafin, Alex Olshevsky, Ioannis Paschalidis

TMLR 2024

Reproducibility assessment. Each entry below gives the variable, the result, and the supporting LLM response.
Research Type: Experimental. "Our theoretical findings are supported by a set of experiments. We conducted experimental studies demonstrating that our theoretically derived batch size and learning rate achieve geometric convergence and outperform other algorithms that rely on parameters selected via grid search, as detailed in Section 6 and Appendix J."
Researcher Affiliation: Academia. Arsenii Mustafin (Department of Computer Science, Boston University); Alex Olshevsky (Department of Electrical and Computer Engineering, Boston University); Ioannis Ch. Paschalidis (Department of Electrical and Computer Engineering, Boston University).
Pseudocode: Yes. "Algorithm 1: TD-SVRG for the finite sample case."
Open Source Code: Yes. "Instructions and code for reproducing the experiments can be found in our GitHub repository."
Open Datasets: Yes. "Databases of size 5,000 are generated from 4 environments: Random MDP (Dann et al., 2014), and the Acrobot, Cart Pole and Mountain Car OpenAI Gym environments (Brockman et al., 2016)."
Dataset Splits: No. The paper discusses how samples are drawn during the algorithm run (e.g., "first, a trajectory of length N is drawn from an MDP following Markovian sampling and forms dataset D = {(s_t, a_t, r_t, s_{t+1})}_{t=1}^{N}. Then TD(0) proceeds by drawing samples from this dataset"), but it does not provide predefined training/validation/test splits of a fixed dataset for reproducible model evaluation.
Hardware Specification: No. No specific hardware details (e.g., GPU/CPU models, memory amounts) are mentioned in the paper for running the experiments.
Software Dependencies: No. No software dependencies with version numbers are provided. The paper mentions OpenAI Gym and the DQN algorithm, but without version details.
Experiment Setup: Yes. "Hyperparameters for the algorithms are selected as follows: for TD-SVRG, our theoretically justified parameters are selected, with the learning rate set to α = 1/8 and the update batch size to M = 16/λ_A; for GTD2, the best-performing parameters were α = 0.125 and β = 0.25; for vanilla TD, a decreasing learning rate α = 1/√t is used; for PD-SVRG, the parameters are set to σ_θ = 0.1/(L_ρ κ(Ĉ)), σ_w = 0.1/λ_max(C), and the batch size is twice the size of the dataset, i.e., M = 2N. Each algorithm for each setting was run 10 times and the geometric average performance is presented."
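The referenced pseudocode (Algorithm 1, TD-SVRG for the finite sample case) follows the standard SVRG pattern applied to the TD(0) update direction: a full-batch anchor computed once per epoch, then variance-reduced stochastic steps. The following is a minimal sketch under that reading, not the authors' implementation; the function signature, feature map `phi`, dataset layout, and defaults are assumptions.

```python
import numpy as np

def td_svrg(D, phi, alpha=1/8, M=64, epochs=30, gamma=0.99, seed=0):
    """Sketch of TD-SVRG for a finite dataset D of (s, r, s_next) transitions.

    alpha = 1/8 matches the paper's theoretical learning rate; the paper's
    batch size is M = 16/lambda_A, which is problem-dependent, so M is left
    as a plain argument here.
    """
    rng = np.random.default_rng(seed)
    N = len(D)
    theta = np.zeros(phi(D[0][0]).shape[0])

    def g(theta, t):
        # TD(0) update direction (a "pseudo-gradient") for one transition
        s, r, s_next = t
        f, f_next = phi(s), phi(s_next)
        return (r + gamma * f_next @ theta - f @ theta) * f

    for _ in range(epochs):
        theta_ref = theta.copy()
        # full-batch anchor direction at the reference point
        g_bar = np.mean([g(theta_ref, t) for t in D], axis=0)
        for _ in range(M):
            t = D[rng.integers(N)]
            # variance-reduced stochastic TD step
            theta = theta + alpha * (g(theta, t) - g(theta_ref, t) + g_bar)
    return theta
```

On a small example the iterates approach the TD fixed point, e.g. a two-state chain with one-hot features, unit rewards, and γ = 1/2, whose fixed point is V = (2, 2).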
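The theoretically justified settings above (α = 1/8, M = 16/λ_A) could be derived directly from a dataset rather than grid-searched; a sketch of such a helper is below. The function name is hypothetical, and taking λ_A as the smallest eigenvalue of the symmetrized matrix A = (1/N) Φᵀ(Φ − γΦ′) is an assumption about the paper's definition.

```python
import numpy as np

def tdsvrg_hyperparams(features, next_features, gamma):
    """Derive TD-SVRG's theoretical settings from feature matrices.

    features:      (N, d) array of phi(s_t) rows
    next_features: (N, d) array of phi(s_{t+1}) rows
    Returns alpha = 1/8 and M = ceil(16 / lambda_A), where lambda_A is
    (assumed to be) the smallest eigenvalue of the symmetrized A.
    """
    N = features.shape[0]
    A = features.T @ (features - gamma * next_features) / N
    lam_A = np.linalg.eigvalsh((A + A.T) / 2).min()
    return {"alpha": 1 / 8, "M": int(np.ceil(16 / lam_A))}
```

For the two-state chain with one-hot features and γ = 1/2, the symmetrized A has smallest eigenvalue 1/4, giving M = 64.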