Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning
Authors: Fei Feng, Ruosong Wang, Wotao Yin, Simon S. Du, Lin Yang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we instantiate our framework on a class of hard exploration problems to demonstrate the practicality of our theory. |
| Researcher Affiliation | Academia | Fei Feng (University of California, Los Angeles); Ruosong Wang (Carnegie Mellon University); Wotao Yin (University of California, Los Angeles); Simon S. Du (University of Washington); Lin F. Yang (University of California, Los Angeles) |
| Pseudocode | Yes | Algorithm 1: A Unified Framework for Unsupervised RL; Algorithm 2: Trajectory Sampling Routine TSR(ULO, π, B); Algorithm 3: FixLabel(f_{H+1}, Z) |
| Open Source Code | Yes | Our code is available at https://github.com/FlorenceFeng/StateDecoding. |
| Open Datasets | No | We conduct experiments in two environments: Lock Bernoulli and Lock Gaussian. These environments, which are designed to be hard for exploration, are also studied in Du et al. (2019a). |
| Dataset Splits | No | The paper describes custom-built environments (Lock Bernoulli and Lock Gaussian) for which data is generated episodically, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages or counts) or provide a method to reproduce such splits from a static dataset. |
| Hardware Specification | No | No specific hardware details such as GPU models, CPU models, or memory specifications used for running experiments are provided in the paper. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x) are mentioned in the paper. |
| Experiment Setup | No | The paper states: 'Details about hyperparameters and unsupervised learning oracles in URL can be found in Appendix C.', deferring the specific experimental setup to a supplemental appendix rather than providing it in the main text. |