Trajectory-Class-Aware Multi-Agent Reinforcement Learning
Authors: Hyungho Na, Kwanghyeon Lee, Sumin Lee, Il-chul Moon
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The proposed method is evaluated on various tasks, including multi-task problems built upon StarCraft II. Empirical results show further performance improvements over state-of-the-art baselines. |
| Researcher Affiliation | Collaboration | ¹Korea Advanced Institute of Science and Technology (KAIST), ²summary.ai; {gudgh723}@gmail.com, EMAIL |
| Pseudocode | Yes | Algorithm 1 Compute J (t, k) ... Algorithm 2 Training algorithm for TRAMA |
| Open Source Code | Yes | Our official code is available at: https://github.com/aailab-kaist/TRAMA. |
| Open Datasets | Yes | In this section, we evaluate TRAMA through multi-task problems built upon SMACv2 (Ellis et al., 2024) and conventional MARL benchmark problems (Samvelyan et al., 2019; Ellis et al., 2024). |
| Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits with percentages or counts for a pre-collected dataset. It refers to environments like SMACv2 which generate data during interaction, and distinguishes between in-distribution and out-of-distribution tasks for evaluation. |
| Hardware Specification | Yes | For experiments, we mainly use GeForce RTX 3090 and GeForce RTX 4090 GPUs. ... Training times of all models are measured on GeForce RTX 3090 or RTX 4090. |
| Software Dependencies | No | Our code is built on PyMARL (Samvelyan et al., 2019) and the open-sourced code from LAGMA (Na & Moon, 2024). The paper mentions software platforms but does not specify version numbers for PyMARL or any other libraries like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | For VQ-VAE training, we use the fixed hyperparameters for all tasks, such as λ_vq = 0.25, λ_commit = 0.125, λ_cvr = 0.125 in Eq. (5), n_ψ = 500, and n_vq,freq = 10. Here, n_ψ is the update interval for clustering and classifier learning, and n_vq,freq represents the update interval of the VQ-VAE. ... Table 4: Hyperparameter settings for TRAMA experiments. |
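The quoted hyperparameters can be collected into a configuration sketch. This is a minimal illustration only: the key names, the helper function, and the exact weighted-sum form of the Eq. (5) loss are assumptions, not the naming or structure used in the TRAMA repository.

```python
# Hypothetical config collecting the VQ-VAE hyperparameters quoted in the
# paper; the dictionary keys and function names below are assumptions.
TRAMA_VQVAE_CONFIG = {
    "lambda_vq": 0.25,       # VQ codebook loss weight (Eq. 5)
    "lambda_commit": 0.125,  # commitment loss weight (Eq. 5)
    "lambda_cvr": 0.125,     # coverage loss weight (Eq. 5)
    "n_psi": 500,            # update interval for clustering and classifier learning
    "n_vq_freq": 10,         # update interval of the VQ-VAE
}

def vq_vae_loss(recon_loss, vq_loss, commit_loss, cvr_loss,
                cfg=TRAMA_VQVAE_CONFIG):
    """Weighted sum in the style of Eq. (5); the exact form is an assumption."""
    return (recon_loss
            + cfg["lambda_vq"] * vq_loss
            + cfg["lambda_commit"] * commit_loss
            + cfg["lambda_cvr"] * cvr_loss)
```

Keeping such values in a single config dictionary makes it straightforward to verify a reproduction run against the fixed settings the paper reports for all tasks.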