Notice: The reproducibility variables underlying each score are classified by an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Anticipatory Fictitious Play
Authors: Alex Cloud, Albert Wang, Wesley Kerr
IJCAI 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct an extensive comparison of our algorithm with fictitious play, proving an optimal O(t⁻¹) convergence rate for certain classes of games, demonstrating superior performance numerically across a variety of games, and concluding with experiments that extend these algorithms to the setting of deep multiagent reinforcement learning. |
| Researcher Affiliation | Industry | Alex Cloud, Albert Wang and Wesley Kerr, Riot Games EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1 NeuPL-FP/AFP |
| Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing code for the work described, nor does it provide a direct link to such source code. |
| Open Datasets | Yes | We use two environments: our own Tiny Fighter, and Running With Scissors from [Vezhnevets et al., 2020]. [Vezhnevets et al., 2020] and [Liu et al., 2022] (Appendix B.1.) feature further discussion of the environment. |
| Dataset Splits | No | The paper describes the training process and evaluation metrics but does not provide specific details on dataset split percentages or counts for training, validation, and testing. |
| Hardware Specification | No | For reinforcement learning, we use the Asynchronous Proximal Policy Optimization (APPO) algorithm [Schulman et al., 2017], a distributed actor-critic RL algorithm, as implemented in RLLib [Moritz et al., 2018] with a single GPU learner. |
| Software Dependencies | No | For reinforcement learning, we use the Asynchronous Proximal Policy Optimization (APPO) algorithm [Schulman et al., 2017], a distributed actor-critic RL algorithm, as implemented in RLLib [Moritz et al., 2018] with a single GPU learner. |
| Experiment Setup | No | Hyperparameter settings are given in the supplementary material, Section E. |
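For context on the algorithm family being classified above, the fictitious play baseline that the paper compares against can be sketched in a few lines. This is a generic illustration of standard fictitious play on rock-paper-scissors, not the authors' NeuPL-FP/AFP implementation; the payoff matrix and iteration count are illustrative choices.

```python
import numpy as np

# Row player's payoff matrix for rock-paper-scissors (zero-sum).
A = np.array([
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
], dtype=float)

def fictitious_play(A, iterations=10000):
    """Standard fictitious play in a two-player zero-sum matrix game.

    Each round, every player best responds to the opponent's empirical
    average strategy. Returns both players' empirical average strategies.
    """
    n, m = A.shape
    row_counts = np.zeros(n)
    col_counts = np.zeros(m)
    # Arbitrary initial pure strategies (both start with action 0).
    row_counts[0] += 1
    col_counts[0] += 1
    for _ in range(iterations):
        row_avg = row_counts / row_counts.sum()
        col_avg = col_counts / col_counts.sum()
        # Row player maximizes, column player minimizes, the row payoff.
        row_br = np.argmax(A @ col_avg)
        col_br = np.argmin(row_avg @ A)
        row_counts[row_br] += 1
        col_counts[col_br] += 1
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()

row_avg, col_avg = fictitious_play(A)
# Both empirical averages approach the uniform equilibrium strategy.
print(row_avg, col_avg)
```

The paper's contribution, anticipatory fictitious play, modifies this baseline's best-response target; per the evidence row above, it attains an optimal O(t⁻¹) convergence rate for certain classes of games, versus the much slower empirical convergence of the vanilla loop shown here.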