Provably efficient multi-task reinforcement learning with model transfer

Authors: Chicheng Zhang, Zhi Wang

NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study multi-task reinforcement learning (RL) in tabular episodic Markov decision processes (MDPs). We formulate a heterogeneous multi-player RL problem in which a group of players concurrently face similar but not necessarily identical MDPs, with the goal of improving their collective performance through inter-player information sharing. We design and analyze an algorithm based on the idea of model transfer, and provide gap-dependent and gap-independent upper and lower bounds that characterize the intrinsic complexity of the problem. |
| Researcher Affiliation | Academia | Chicheng Zhang, University of Arizona; Zhi Wang, University of California San Diego |
| Pseudocode | Yes | Algorithm 1: MULTI-TASK-EULER |
| Open Source Code | No | No explicit statement of, or link to, open-source code for the described methodology was found. |
| Open Datasets | No | The paper is theoretical: it studies a problem setting in tabular episodic MDPs rather than training or evaluating on a concrete dataset. |
| Dataset Splits | No | The paper is theoretical; no train/validation/test dataset splits are mentioned. |
| Hardware Specification | No | The paper is theoretical; no hardware specifications for running experiments are mentioned. |
| Software Dependencies | No | The paper is theoretical; no software dependencies with version numbers are mentioned. |
| Experiment Setup | No | The paper is theoretical; no experimental setup details such as hyperparameters or training configurations are provided. |
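The paper's MULTI-TASK-EULER pseudocode is not reproduced here. As a minimal illustrative sketch of the model-transfer idea only (not the authors' algorithm), one core ingredient is pooling empirical transition counts across players facing similar MDPs to sharpen each player's model estimate. The function name and array layout below are hypothetical:

```python
import numpy as np

def pooled_transition_estimate(counts_per_player):
    """Pool per-player transition counts to estimate a shared model (sketch only).

    counts_per_player: list of (S, A, S) arrays, where N[s, a, s'] is the number
    of times a player observed transition (s, a) -> s'. Returns the pooled
    empirical transition probabilities P_hat[s, a, s'] and per-(s, a) visit
    totals n[s, a]. Unvisited (s, a) pairs fall back to a uniform estimate.
    """
    total = np.sum(counts_per_player, axis=0).astype(float)  # pooled (S, A, S)
    n_sa = total.sum(axis=-1, keepdims=True)                 # visits per (s, a)
    num_states = total.shape[-1]
    # Normalize where data exists; use a uniform distribution otherwise.
    p_hat = np.where(n_sa > 0, total / np.maximum(n_sa, 1.0), 1.0 / num_states)
    return p_hat, n_sa.squeeze(-1)
```

In the heterogeneous setting the paper studies, naive pooling like this can be biased when players' MDPs differ, which is why the actual algorithm combines shared and individual estimates with confidence intervals; this sketch only shows the information-sharing step.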