Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity
Authors: Kaiqing Zhang, Sham M. Kakade, Tamer Basar, Lin F. Yang
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we aim to address the fundamental question about its sample complexity. We study arguably the most basic MARL setting: two-player discounted zero-sum Markov games, given only access to a generative model. We show that model-based MARL achieves a sample complexity of Õ(\|S\|\|A\|\|B\|(1−γ)⁻³ϵ⁻²) for finding the Nash equilibrium (NE) value up to some ϵ error... Our results not only illustrate the sample-efficiency of this basic model-based MARL approach, but also elaborate on the fundamental tradeoff between its power (easily handling the reward-agnostic case) and limitation (less adaptive and suboptimal in \|A\|, \|B\|), which particularly arises in the multi-agent context. |
| Researcher Affiliation | Academia | Kaiqing Zhang EMAIL University of Maryland, College Park College Park, MD 20740, USA Sham M. Kakade EMAIL Harvard University Cambridge, MA 02138, USA Tamer Başar EMAIL University of Illinois at Urbana-Champaign Urbana, IL 61801, USA Lin F. Yang EMAIL University of California, Los Angeles Los Angeles, CA 90095, USA |
| Pseudocode | No | The paper describes theoretical results, including theorems and proofs, and discusses existing planning algorithms such as value iteration and policy iteration, but it does not present any structured pseudocode or algorithm blocks for its proposed methods. |
| Open Source Code | No | The paper discusses theoretical sample complexity, lower bounds, and proofs for model-based multi-agent reinforcement learning. It does not contain any statements about open-sourcing code, links to code repositories, or mention of code in supplementary materials for the described methodology. |
| Open Datasets | No | We study arguably the most basic MARL setting: two-player discounted zero-sum Markov games, given only access to a generative model. This generative model allows agents to sample the MG, and query the next state from the transition process, given any state-action pair as input. The paper focuses on theoretical analysis using this generative model concept, not on experiments with specific open datasets. |
| Dataset Splits | No | The paper is theoretical and does not present experimental results based on specific datasets. Therefore, it does not include information about training, test, or validation dataset splits. |
| Hardware Specification | No | The paper presents a theoretical analysis of model-based multi-agent reinforcement learning, including proofs and lower bounds. It does not describe any experiments or specify hardware used for computations. |
| Software Dependencies | No | The paper focuses on theoretical contributions and does not describe any practical implementation or experiments that would require specific software dependencies with version numbers. |
| Experiment Setup | No | The paper provides a theoretical analysis of model-based MARL and does not include any experimental results, hence there are no details regarding hyperparameter values, training configurations, or other elements of an experimental setup. |
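The sample-complexity bound quoted in the Research Type row, Õ(\|S\|\|A\|\|B\|(1−γ)⁻³ϵ⁻²), can be evaluated numerically to get a feel for its scaling. The sketch below is our own illustration, not code from the paper: the function name is hypothetical, and we deliberately drop the constants and log factors hidden by the Õ notation.

```python
def nominal_sample_count(S, A, B, gamma, eps):
    """Leading term of the O-tilde bound: |S||A||B| / ((1-gamma)^3 * eps^2).

    Constants and logarithmic factors are dropped, so this is only
    useful for studying how the bound scales with each parameter.
    """
    return S * A * B / ((1.0 - gamma) ** 3 * eps ** 2)

# Scaling sanity checks (exact in binary floating point for these inputs):
# halving the target error eps quadruples the nominal sample count.
base = nominal_sample_count(10, 4, 4, 0.5, 0.25)
assert nominal_sample_count(10, 4, 4, 0.5, 0.125) == 4 * base
```

The ϵ⁻² dependence matches the familiar single-agent generative-model rate, while the \|A\|\|B\| product (rather than \|A\| + \|B\|) is exactly the suboptimality in the action spaces that the paper's quoted tradeoff discussion refers to.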