Trajectory-Class-Aware Multi-Agent Reinforcement Learning

Authors: Hyungho Na, Kwanghyeon Lee, Sumin Lee, Il-chul Moon

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The proposed method is evaluated on various tasks, including multi-task problems built upon StarCraft II. Empirical results show further performance improvements over state-of-the-art baselines.
Researcher Affiliation | Collaboration | (1) Korea Advanced Institute of Science and Technology (KAIST), (2) summary.ai; {gudgh723}@gmail.com, EMAIL
Pseudocode | Yes | Algorithm 1: Compute J(t, k) ... Algorithm 2: Training algorithm for TRAMA
Open Source Code | Yes | Our official code is available at: https://github.com/aailab-kaist/TRAMA.
Open Datasets | Yes | In this section, we evaluate TRAMA through multi-task problems built upon SMACv2 (Ellis et al., 2024) and conventional MARL benchmark problems (Samvelyan et al., 2019; Ellis et al., 2024).
Dataset Splits | No | The paper does not explicitly provide training/validation/test splits, with percentages or counts, for a pre-collected dataset. It refers to environments such as SMACv2, which generate data during interaction, and distinguishes between in-distribution and out-of-distribution tasks for evaluation.
Hardware Specification | Yes | For experiments, we mainly use GeForce RTX 3090 and GeForce RTX 4090 GPUs. ... Training times of all models are measured in GeForce RTX 3090 or RTX 4090.
Software Dependencies | No | Our code is built on PyMARL (Samvelyan et al., 2019) and the open-sourced code from LAGMA (Na & Moon, 2024). The paper names these software platforms but does not specify version numbers for PyMARL or for other dependencies such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | For VQ-VAE training, we use the fixed hyperparameters for all tasks, such as λ_vq = 0.25, λ_commit = 0.125, λ_cvr = 0.125 in Eq. (5), n_ψ = 500, and n_freq^vq = 10. Here, n_ψ is the update interval for clustering and classifier learning, and n_freq^vq represents the update interval of VQ-VAE. ... Table 4: Hyperparameter settings for TRAMA experiments.
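
The fixed VQ-VAE hyperparameters quoted in the Experiment Setup row can be collected into a small configuration object. The sketch below is illustrative only and is not taken from the TRAMA repository. The class, field, and function names are hypothetical. It assumes that Eq. (5) is a reconstruction term plus three λ-weighted terms, and that both update intervals are counted in training iterations; neither assumption is confirmed by the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VQVAEConfig:
    """Values quoted in the Experiment Setup row; field names are hypothetical."""
    lambda_vq: float = 0.25
    lambda_commit: float = 0.125
    lambda_cvr: float = 0.125
    n_psi: int = 500      # update interval for clustering and classifier learning
    n_vq_freq: int = 10   # update interval of the VQ-VAE


def weighted_vq_loss(cfg, recon, vq, commit, cvr):
    # ASSUMPTION: Eq. (5) is a reconstruction term plus three weighted terms.
    # The exact form of the loss is defined in the paper, not here.
    return (recon
            + cfg.lambda_vq * vq
            + cfg.lambda_commit * commit
            + cfg.lambda_cvr * cvr)


def due_updates(cfg, step):
    # ASSUMPTION: intervals are measured in training iterations.
    return {
        "vqvae": step % cfg.n_vq_freq == 0,
        "clustering_and_classifier": step % cfg.n_psi == 0,
    }


if __name__ == "__main__":
    cfg = VQVAEConfig()
    print(weighted_vq_loss(cfg, recon=1.0, vq=1.0, commit=1.0, cvr=1.0))
    print(due_updates(cfg, step=500))
```

Under these assumptions the VQ-VAE would be refreshed fifty times for every clustering and classifier update, since 500 / 10 = 50. Table 4 of the paper remains the authoritative source for per-task settings.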