IntersectionZoo: Eco-driving for Benchmarking Multi-Agent Contextual Reinforcement Learning
Authors: Vindula Jayawardana, Baptiste Freydt, Ao Qu, Cameron Hickert, Zhongxia Yan, Cathy Wu
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using these traffic scenarios, we benchmark popular multi-agent RL and human-like driving algorithms and demonstrate that the popular multi-agent RL algorithms struggle to generalize in CRL settings. |
| Researcher Affiliation | Academia | ¹MIT, ²ETH Zurich |
| Pseudocode | No | The paper describes the methodology and components of IntersectionZoo through textual descriptions and mathematical equations, such as Equation 2 for optimization and Equation 3 for reward definition, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and documentation are available at https://github.com/mit-wu-lab/IntersectionZoo. |
| Open Datasets | Yes | IntersectionZoo is built on data-informed simulations of 16,334 signalized intersections derived from 10 major US cities, modeled in an open-source industry-grade microscopic traffic simulator. By modeling factors affecting vehicular exhaust emissions (e.g., temperature, road conditions, travel demand), IntersectionZoo provides one million data-driven traffic scenarios. We use Open Street Maps (OSM) (Haklay & Weber, 2008) data and follow guidelines provided by Qu et al. (2022). Intersection lane lengths, lane counts, turn lane configurations, and speed limits are extracted from OSM. Road grades are taken from US geological surveys (Survey). To model the vehicle arrival process, we use the Annual Average Daily Traffic data (AADT) (Huntsinger, 2022) released by the Departments of Transportation of each state/city. We source vehicle age, fuel type, and vehicle type distributions from the openly available MOVES databases (epa), and data from the US National Centers for Environmental Information (for Environmental Information) is used to model atmospheric conditions with temperature and humidity changes. [...] with real-world arterial driving data from CitySim (Zheng et al., 2022). |
| Dataset Splits | Yes | By default, IntersectionZoo provides interfaces for train/test split evaluations to measure generalization, which is often used with zero-shot policy transfer (Harrison et al., 2019; Higgins et al., 2017; Kirk et al., 2021). This means we train policies on one subset of context-MDPs and test on another subset of context-MDPs. This includes both IID and OOD evaluation protocols. Hence, OOD evaluation can be performed by training in one city (train CMDP) and testing in another city (test CMDP). Similarly, IID testing can be performed by a train/test split of context-MDPs within a given city. |
| Hardware Specification | Yes | Experiments were carried out in a computing cluster with 20 CPUs and an NVIDIA Volta V100 GPU with 32 GB RAM. |
| Software Dependencies | No | The paper mentions using RLLib and SUMO as software dependencies, but it does not provide specific version numbers for these tools. For example, 'All experiments are carried out using RLLib (Liang et al., 2018) with the default hyperparameter configuration.' and 'All traffic scenarios are configured for use in the open-source agent-based traffic simulator SUMO (Lopez et al., 2018).' |
| Experiment Setup | Yes | All experiments are carried out using RLLib (Liang et al., 2018) with the default hyperparameter configuration. We use 10 workers in training the multi-task learning policies. Each benchmarking run took roughly 24 hours in RLLib, with 5000 episodes (each with a horizon of 1000 steps with 50 warmups). For the reported results in Section 6, for each algorithm, we train with four random seeds. We train for 500 training iterations to ensure policies are well-converged. |
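
The IID and OOD protocols quoted in the Dataset Splits row can be illustrated with a minimal Python sketch. This is not IntersectionZoo's own interface: the function names `iid_split` and `ood_split`, the `(city, intersection_id)` context tuples, the city labels, and the 20% test fraction are all hypothetical and chosen only for illustration.

```python
import random

def iid_split(contexts, city, test_fraction=0.2, seed=0):
    """IID protocol: disjoint train/test subsets of context-MDPs from ONE city."""
    # Sort first so the shuffle is reproducible regardless of input order.
    pool = sorted(c for c in contexts if c[0] == city)
    rng = random.Random(seed)
    rng.shuffle(pool)
    n_test = max(1, int(len(pool) * test_fraction))
    return pool[n_test:], pool[:n_test]  # (train, test)

def ood_split(contexts, train_city, test_city):
    """OOD protocol: train on every context of one city, test on another city."""
    train = [c for c in contexts if c[0] == train_city]
    test = [c for c in contexts if c[0] == test_city]
    return train, test

# Hypothetical contexts: (city, intersection_id) pairs standing in for context-MDPs.
contexts = [("city_a", i) for i in range(10)] + [("city_b", i) for i in range(5)]

iid_train, iid_test = iid_split(contexts, "city_a")
ood_train, ood_test = ood_split(contexts, "city_a", "city_b")
```

In both cases a policy would be trained only on the first returned list and evaluated zero-shot on the second. The two protocols differ only in whether the test contexts come from the same city as the training contexts.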