Provably Efficient Reinforcement Learning with Linear Function Approximation under Adaptivity Constraints

Authors: Tianhao Wang, Dongruo Zhou, Quanquan Gu

NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | In Section 6 we present the numerical experiment which supports our theory. |
| Researcher Affiliation | Academia | Tianhao Wang, Department of Statistics and Data Science, Yale University, New Haven, CT 06511, EMAIL; Dongruo Zhou, Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095, EMAIL; Quanquan Gu, Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095, EMAIL |
| Pseudocode | Yes | Algorithm 1 LSVI-UCB-Batch (a sketch of its update schedule follows the table) |
| Open Source Code | No | (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No] |
| Open Datasets | No | We run our algorithms, LSVI-UCB-Batch and LSVI-UCB-RareSwitch, on a synthetic linear MDP given in Example 6.1, and compare them with the fully adaptive baseline, LSVI-UCB (Jin et al., 2020). |
| Dataset Splits | No | The paper uses a synthetic MDP and evaluates performance by regret over episodes; it does not describe dataset splits such as training, validation, or test sets. |
| Hardware Specification | Yes | All experiments are performed on a PC with Intel i7-9700K CPU. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | Yes | In our experiment, we set H = 10, K = 2500, δ = 0.35 and d = 13, thus A contains 1024 actions. [...] In detail, for LSVI-UCB-Batch, we run the algorithm for B = 10, 20, 30, 40, 50 respectively; for LSVI-UCB-RareSwitch, we set η = 2, 4, 8, 16, 32. (The determinant-based switching rule is sketched below.) |
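
For concreteness, the following is a minimal runnable sketch of the mechanism behind Algorithm 1 (LSVI-UCB-Batch): the K episodes are split into B batches, and the optimistic Q-estimates are recomputed, via regularized least-squares value iteration with a UCB bonus in the style of LSVI-UCB (Jin et al., 2020), only at the start of each batch. The toy tabular MDP with one-hot features is a hypothetical stand-in for the paper's Example 6.1, whose exact construction is not reproduced here, and the values of `S`, `A`, `lam`, and `beta` are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy tabular MDP treated as a linear MDP via one-hot features
# phi(s, a) = e_{(s, a)}. This is NOT the paper's Example 6.1; it is a
# hypothetical stand-in so the batching logic can be run end to end.
S, A, H = 5, 4, 10        # states, actions, horizon (H = 10 as in the paper)
K, B = 2500, 10           # episodes and batches (one of the reported settings)
d = S * A                 # feature dimension of the one-hot embedding
lam, beta = 1.0, 1.0      # ridge parameter and bonus scale (assumed values)

P = rng.dirichlet(np.ones(S), size=(S, A))  # P[s, a] is a distribution over s'
R = rng.uniform(size=(S, A))                # mean rewards in [0, 1]

def phi(s, a):
    v = np.zeros(d)
    v[s * A + a] = 1.0
    return v

def lsvi_ucb(history):
    """One backward least-squares value iteration pass with a UCB bonus:
    the per-update computation shared by LSVI-UCB and its batched variant."""
    Q = np.zeros((H + 1, S, A))
    for h in range(H - 1, -1, -1):
        Lam = lam * np.eye(d)
        target = np.zeros(d)
        for (s, a, r, s2) in history[h]:
            f = phi(s, a)
            Lam += np.outer(f, f)
            target += f * (r + Q[h + 1, s2].max())
        w = np.linalg.solve(Lam, target)
        Lam_inv = np.linalg.inv(Lam)
        for s in range(S):
            for a in range(A):
                f = phi(s, a)
                bonus = beta * np.sqrt(f @ Lam_inv @ f)
                Q[h, s, a] = min(f @ w + bonus, H)  # optimism, clipped at H
    return Q

history = [[] for _ in range(H)]
batch_starts = {b * (K // B) for b in range(B)}  # policy switches happen here only
Q = np.zeros((H + 1, S, A))

for k in range(K):
    if k in batch_starts:    # LSVI-UCB-Batch: recompute only at batch boundaries
        Q = lsvi_ucb(history)
    s = 0                    # fixed initial state
    for h in range(H):
        a = int(Q[h, s].argmax())      # act greedily w.r.t. the optimistic Q
        s2 = rng.choice(S, p=P[s, a])
        history[h].append((s, a, float(R[s, a]), s2))
        s = s2
```

The fully adaptive baseline LSVI-UCB corresponds to recomputing at every episode; the batched schedule caps the number of policy switches at B, which is exactly the adaptivity constraint the experiment varies over B = 10, 20, 30, 40, 50.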
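
LSVI-UCB-RareSwitch replaces the fixed batch schedule with a data-driven rule: the policy is recomputed only when the determinant of some per-step covariance matrix Λ_h has grown by a factor of η since the last update. Below is a sketch of that check, with illustrative names (`Lam_now` and `Lam_last` are the H current and last-update covariance matrices).

```python
import numpy as np

def should_switch(Lam_now, Lam_last, eta):
    """Switching rule of LSVI-UCB-RareSwitch: recompute the policy iff
    det(Lam_now[h]) > eta * det(Lam_last[h]) for some step h. Comparing
    log-determinants avoids overflow as the matrices accumulate data."""
    for now, prev in zip(Lam_now, Lam_last):
        if np.linalg.slogdet(now)[1] > np.log(eta) + np.linalg.slogdet(prev)[1]:
            return True
    return False
```

In the sketch above, this test (with η drawn from the reported grid 2, 4, 8, 16, 32) would replace the `k in batch_starts` check, with each Λ_h maintained incrementally and snapshotted whenever the policy is recomputed.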