Evolutionary Dynamics of Multi-Agent Learning: A Survey

Authors: Daan Bloembergen, Karl Tuyls, Daniel Hennes, Michael Kaisers

JAIR 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We established the link between multi-agent reinforcement learning and the replicator dynamics of evolutionary game theory in Section 3, and provided an overview of learning dynamics in normal-form games, continuous strategy spaces, and stochastic (Markov) games in Section 4. Here, we show a set of experiments that empirically validate these models in two-player two-action normal-form games. We restrict ourselves to these games as their simplicity allows easy visual analysis, while preserving the explanatory power of such dynamical models. At the end of this section we provide an overview of related empirical work in more complex interactions."
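The replicator dynamics referenced in this row can be sketched directly for a two-player two-action game. The following is a minimal illustration, assuming the standard two-population form dx_i/dt = x_i[(Ay)_i − xᵀAy] (and symmetrically for y); the Prisoner's Dilemma payoff matrix below is a conventional example, not taken from the paper's own tables:

```python
import numpy as np

# Two-population replicator dynamics for a two-player two-action game.
# A is the row player's payoff matrix, B the column player's.
# These payoffs are a standard Prisoner's Dilemma (illustrative only):
# actions are (cooperate, defect).
A = np.array([[3.0, 0.0],
              [5.0, 1.0]])
B = A.T  # symmetric game: column player's payoffs are the transpose

def replicator_step(x, y, dt=0.01):
    """One Euler step of the coupled replicator equations:
       dx_i/dt = x_i * ((A y)_i - x.A.y), and analogously for y."""
    fx = A @ y        # fitness of each row-player action
    fy = B.T @ x      # fitness of each column-player action
    x = x + dt * x * (fx - x @ fx)
    y = y + dt * y * (fy - y @ fy)
    return x, y

x = np.array([0.5, 0.5])  # row player's mixed strategy
y = np.array([0.5, 0.5])  # column player's mixed strategy
for _ in range(10_000):
    x, y = replicator_step(x, y)
print(x, y)  # defection (second action) comes to dominate
```

Forward-Euler integration is used only for simplicity; the Euler step preserves the probability simplex up to floating-point error because the replicator vector field sums to zero.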
Researcher Affiliation | Collaboration | Daan Bloembergen (EMAIL) and Karl Tuyls (EMAIL), Department of Computer Science, University of Liverpool, Ashton Building, Ashton Street, Liverpool L69 3BX, UK; Daniel Hennes (EMAIL), Advanced Concepts Team, European Space Agency, Keplerlaan 1, 2201 AZ Noordwijk, NL; Michael Kaisers (EMAIL), Centrum Wiskunde & Informatica, Science Park 123, 1098 XG Amsterdam, NL
Pseudocode | No | The paper describes algorithms and dynamics using mathematical equations and textual explanations, but does not include any clearly labeled pseudocode blocks or algorithm listings.
Open Source Code | No | The paper does not provide any explicit statements about releasing source code, nor does it include links to code repositories.
Open Datasets | No | The paper discusses various game theory scenarios (e.g., the Prisoner's Dilemma, Stag Hunt, Matching Pennies, and Battle of the Sexes) and their dynamics, as well as applications in stock markets and multi-robot systems. However, it does not explicitly state the use of specific publicly available datasets with concrete access information for its experiments. The 'Heuristic Payoff Tables' section describes a method for approximating payoffs but does not refer to public datasets.
Dataset Splits | No | The paper describes experiments validating dynamical models in two-player two-action normal-form games and applies meta-strategies to scenarios like stock markets and multi-robot systems. However, it does not provide specific details on dataset splits (e.g., training/validation/test percentages or counts) for any empirical evaluations mentioned.
Hardware Specification | No | The paper conducts empirical validation of models and simulations (e.g., in Sections 5 and 6), but it does not specify any hardware details such as CPU models, GPU models, or memory specifications used for these experiments.
Software Dependencies | No | The paper discusses various algorithms like Q-learning, Cross learning, IGA, FAQ, and LFAQ, but it does not specify any software names with version numbers (e.g., Python, PyTorch, libraries) that would be needed to replicate the experimental results.
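Of the algorithms this row names, Cross learning is the one whose expected update is known to follow the replicator dynamics, which is the link the survey formalises. A minimal sketch of the Cross learning update in a two-action game, assuming the standard rule x_i ← x_i + α·r·(1 − x_i) for the chosen action and x_j ← x_j − α·r·x_j otherwise; the payoff matrix is an illustrative Prisoner's Dilemma rescaled to [0, 1] (Cross learning requires rewards in that range), not the paper's own setup:

```python
import random

# Cross learning (a learning automaton) in a two-action game.
# Illustrative Prisoner's Dilemma payoffs rescaled to [0, 1]:
# actions are (cooperate, defect); A[a][b] is the row player's reward.
A = [[0.6, 0.0],
     [1.0, 0.2]]

def cross_update(x, action, reward, alpha=0.01):
    """x_i <- x_i + alpha*r*(1 - x_i) for the chosen action,
       x_j <- x_j - alpha*r*x_j for the others (sum stays 1)."""
    return [xi + alpha * reward * ((1.0 if i == action else 0.0) - xi)
            for i, xi in enumerate(x)]

random.seed(0)
x = [0.5, 0.5]  # row player's policy
y = [0.5, 0.5]  # column player's policy (same game by symmetry)
for _ in range(50_000):
    a = 0 if random.random() < x[0] else 1
    b = 0 if random.random() < y[0] else 1
    x = cross_update(x, a, A[a][b])
    y = cross_update(y, b, A[b][a])
print(x, y)  # both policies drift toward defection (action 1)
```

With a small learning rate the stochastic trajectory stays close to the replicator ODE, so in this dominance-solvable game both policies move toward defection.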
Experiment Setup | No | The paper discusses the theoretical models and their empirical validation, including parameters like learning rates (α) and temperature (τ), but it does not provide a comprehensive or specific experimental setup including concrete hyperparameter values or system-level training configurations in a dedicated section or table within the main text.
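The temperature τ mentioned in this row governs Boltzmann (softmax) action selection, the exploration scheme used by the Q-learning variants the table lists. A minimal sketch of the standard rule p_i = exp(q_i/τ) / Σ_j exp(q_j/τ); the Q-values below are hypothetical, chosen only to show the effect of τ:

```python
import math

# Boltzmann (softmax) action selection with temperature tau.
def boltzmann_probs(q_values, tau):
    """p_i = exp(q_i / tau) / sum_j exp(q_j / tau)."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / tau) for q in q_values]
    z = sum(exps)
    return [e / z for e in exps]

q = [1.0, 0.5]  # hypothetical Q-values for two actions
print(boltzmann_probs(q, tau=10.0))  # high tau: near-uniform exploration
print(boltzmann_probs(q, tau=0.1))   # low tau: near-greedy exploitation
```

High temperatures flatten the distribution toward uniform exploration; as τ → 0 the rule approaches greedy action selection, which is why concrete τ values (and any annealing schedule) matter for reproducing such experiments.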