Inverse Factorized Soft Q-Learning for Cooperative Multi-agent Imitation Learning

Authors: The Viet Bui, Tien Mai, Thanh Nguyen

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type: Experimental
"We present extensive experiments conducted on several challenging multi-agent game environments, including an advanced version of the StarCraft Multi-Agent Challenge (SMACv2), which demonstrate the effectiveness of our algorithm."
Researcher Affiliation: Academia
"The Viet Bui, Singapore Management University, Singapore (EMAIL); Tien Mai, Singapore Management University, Singapore (EMAIL); Thanh Hong Nguyen, University of Oregon, Eugene, Oregon, United States (EMAIL)"
Pseudocode: Yes
"B.1 MIFQ Algorithm. The detailed steps of our MIFQ algorithm are shown in Algorithm 1 below. Algorithm 1: Multi-agent Inverse Factorized Q-Learning"
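The paper's MIFQ algorithm builds on inverse soft Q-learning with factorized per-agent Q-functions. As a rough illustration of the underlying objective, the sketch below implements an IQ-Learn-style inverse soft-Q loss on a toy tabular problem, using a simple sum of local Q-tables (VDN-style) as a stand-in for the paper's mixing networks. All function names, the tabular setting, and the sum-based factorization are assumptions for illustration, not the paper's actual architecture.

```python
import numpy as np

def soft_value(q_row, alpha=1.0):
    # Soft state value V(s) = alpha * log sum_a exp(Q(s, a) / alpha),
    # computed with the max-shift trick for numerical stability.
    z = q_row / alpha
    m = z.max()
    return alpha * (m + np.log(np.exp(z - m).sum()))

def inverse_soft_q_objective(per_agent_q, expert, policy, gamma=0.99, alpha=1.0):
    """Toy IQ-Learn-style objective with a sum-based (VDN-style) factorization.

    per_agent_q: list of (num_states, num_actions) arrays, one local Q-table
                 per agent (a stand-in for the paper's Q-networks).
    expert / policy: lists of (state, joint_action, next_state) transitions,
                 where joint_action is a tuple of per-agent actions.
    """
    def q_tot(s, acts):
        # Joint Q as the sum of local Qs (illustrative factorization only).
        return sum(q[s, a] for q, a in zip(per_agent_q, acts))

    def v_tot(s):
        # Joint soft value as the sum of local soft values.
        return sum(soft_value(q[s], alpha) for q in per_agent_q)

    # Expert term pushes Q up on demonstrated joint actions; the policy term
    # regularizes the soft Bellman residual on the learner's own transitions.
    expert_term = np.mean([q_tot(s, a) - gamma * v_tot(s2) for s, a, s2 in expert])
    policy_term = np.mean([v_tot(s) - gamma * v_tot(s2) for s, _, s2 in policy])
    return expert_term - policy_term
```

Raising the local Q-value of a demonstrated joint action increases this objective, which is the direction a gradient-based learner would move; the full method replaces the tables and the sum with neural networks and mixing networks.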
Open Source Code: Yes
"We also uploaded our source code for reproducibility purposes. Our source code is submitted alongside the paper, accompanied by sufficient instructions. We will share the code publicly for reproducibility or benchmarking purposes."
Open Datasets: Yes
"Finally, we conduct extensive experiments in three domains: SMACv2 [9], Gold Miner [12], and MPE (Multi-Particle Environments) [25]."
Dataset Splits: No
The paper refers to using expert trajectories for imitation learning and replay buffers for training, but does not specify explicit train/validation/test splits, with percentages or counts, for the expert demonstrations.
Hardware Specification: Yes
"We use four High-Performance Computing (HPC) clusters for training and evaluating all tasks. Specifically, each HPC cluster has a workload with an NVIDIA L40 GPU (48 GB GDDR6), 32 Intel CPU cores, and 100 GB RAM."
Software Dependencies: No
The paper provides general hyper-parameters in Table 2 but does not list specific software dependencies (e.g., programming languages, libraries, or frameworks) with version numbers.
Experiment Setup: Yes
"Table 2: Hyper-parameters (columns: MPEs / Miner / SMACv2). Max training steps: 100000 / 1000000; Evaluate times: 32; Buffer size: 100000 / 5000; Learning rate: 2e-5 / 5e-4; Batch size: 128; Hidden dim: 256; Gamma: 0.99; Target update frequency: 4; Number of random seeds: 4." (Where a single value is listed, it is shared across environments; where two values are listed, the extracted table does not make the per-environment assignment explicit.)
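To make the reported setup concrete, here is a minimal sketch of how the Table 2 hyper-parameters could be organized as a config, separating the values shared across environments from the per-environment ones. The key names and the `make_config` helper are hypothetical, and the per-environment values passed in the example are only one plausible reading of the table's larger column.

```python
# Shared hyper-parameters reported in Table 2 (same across all environments).
SHARED = {
    "evaluate_times": 32,
    "batch_size": 128,
    "hidden_dim": 256,
    "gamma": 0.99,
    "target_update_frequency": 4,
    "num_random_seeds": 4,
}

def make_config(max_training_steps, buffer_size, learning_rate):
    """Merge environment-specific values with the shared defaults."""
    cfg = dict(SHARED)
    cfg.update(
        max_training_steps=max_training_steps,
        buffer_size=buffer_size,
        learning_rate=learning_rate,
    )
    return cfg

# Illustrative setting using the larger of the two reported values per row;
# the extracted table does not state which environment each value belongs to.
example_cfg = make_config(max_training_steps=1000000,
                          buffer_size=5000,
                          learning_rate=5e-4)
```

Keeping shared and per-environment values separate makes it easy to audit which settings a reproduction actually varied.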