reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Finite-Time Analysis of Decentralized Single-Timescale Actor-Critic

Authors: Qijun Luo, Xiao Li

TMLR 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Finally, we conduct experiments to show the superiority of our algorithm over the existing decentralized AC algorithms.
Researcher Affiliation	Academia	Qijun Luo (EMAIL): School of Science and Engineering; Shenzhen Research Institute of Big Data (SRIBD); The Chinese University of Hong Kong, Shenzhen; Shenzhen, China. Xiao Li (EMAIL): School of Data Science; Shenzhen Institute of Artificial Intelligence and Robotics for Society (AIRS); The Chinese University of Hong Kong, Shenzhen; Shenzhen, China.
Pseudocode	Yes	Algorithm 1: Decentralized single-timescale AC (reward estimator version); Algorithm 2: Decentralized single-timescale AC (noisy reward version); Algorithm 3: Decentralized single-timescale NAC
Open Source Code	No	The paper does not contain an explicit statement about releasing source code or a link to a code repository.
Open Datasets	No	We adopt the grounded communication environment proposed in (Mordatch & Abbeel, 2018). Our task consists of N agents and the corresponding N landmarks inhabited in a two-dimension world, where each agent can observe the relative position of other agents and landmarks.
Dataset Splits	No	The paper describes the experimental environment and task setup but does not specify any training/test/validation dataset splits. The experiments appear to be conducted within a simulated environment rather than on a pre-defined dataset with splits.
Hardware Specification	No	The paper does not provide specific details about the hardware used for running the experiments.
Software Dependencies	No	The paper does not list specific software dependencies with version numbers.
Experiment Setup	Yes	For "SDAC-re" and "SDAC-noi", we set α_k = 0.01(k + 1)^{-0.5}, β_k = 0.1(k + 1)^{-0.5}, η_k = 0.1(k + 1)^{-0.5}, K_c = 5, σ = 0.5, K_r = 2. For "DLDAC", we fix T_c = 50, T_c' = 10, T = 5, N_c = 10, N = 100, σ = 0.1, which is adopted by their paper (see comparisons under different hyper-parameters in Appendix A). We set α = 0.01, β = 0.1 for "DLDAC" since we observe that larger step sizes will result in divergence. We set α_k = 0.01(k + 1)^{-0.5}, β_k = 0.1(k + 1)^{-0.5}, K_r = 2, σ = 0.5 and examine the consensus periods K_c of 1, 5, 10, and 20, respectively.
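
To make the step-size schedule in the Experiment Setup row concrete, the following is a minimal sketch of how the quoted SDAC hyperparameters could be written down and evaluated. It is not the authors' code (the paper releases none). The class name SDACSchedule, its field names, and the constant CONSENSUS_PERIODS are hypothetical. The exponent is assumed to be -0.5, i.e. a decaying schedule, because the sign appears to have been lost when the text was extracted. The meanings of K_c, K_r, and σ are not spelled out in the row beyond K_c being the consensus period, so they are stored as plain values only.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SDACSchedule:
    """Hypothetical container for the SDAC-re / SDAC-noi settings quoted above."""

    alpha0: float = 0.01    # base value of alpha_k
    beta0: float = 0.1      # base value of beta_k
    eta0: float = 0.1       # base value of eta_k
    exponent: float = -0.5  # ASSUMPTION: decaying schedule, sign lost in extraction
    K_c: int = 5            # consensus period
    K_r: int = 2            # quoted as K_r = 2; role not described in the row
    sigma: float = 0.5      # quoted as sigma = 0.5; role not described in the row

    def step_sizes(self, k: int) -> Tuple[float, float, float]:
        """Return (alpha_k, beta_k, eta_k) for iteration index k >= 0."""
        decay = (k + 1) ** self.exponent
        return self.alpha0 * decay, self.beta0 * decay, self.eta0 * decay


# Consensus periods K_c examined in the ablation described in the row.
CONSENSUS_PERIODS = (1, 5, 10, 20)

if __name__ == "__main__":
    schedule = SDACSchedule()
    for k in (0, 3, 99):
        alpha_k, beta_k, eta_k = schedule.step_sizes(k)
        print(f"k={k}: alpha={alpha_k:.5f}, beta={beta_k:.5f}, eta={eta_k:.5f}")
```

Under this assumed schedule all three step sizes shrink at the same rate, so their ratios stay fixed across iterations. If the paper's exponent differs, only the exponent field needs to change.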