MADRaS: Multi Agent Driving Simulator
Authors: Anirban Santara, Sohan Rudra, Sree Aditya Buridi, Meha Kaushik, Abhishek Naik, Bharat Kaul, Balaraman Ravindran
JAIR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we present the results of six experiments on single and multi-agent RL for learning to drive in MADRaS. Table 4 presents a brief outline of our experiments and their individual motivations. |
| Researcher Affiliation | Collaboration | Anirban Santara, Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, WB, India; Meha Kaushik, Microsoft, Vancouver, Canada |
| Pseudocode | No | The paper describes the implementation of a PID controller and mentions algorithms like Proximal Policy Optimization (PPO), but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | MADRaS is open source and aims to contribute to the democratization of artificial intelligence. Code available at https://github.com/madras-simulator/MADRaS |
| Open Datasets | No | The paper uses the MADRaS simulator, which is built on TORCS and inherits its assets (tracks, cars), but does not explicitly mention using or providing concrete access information for a separate, pre-existing open dataset for experimental training or evaluation. |
| Dataset Splits | No | The paper conducts experiments in a simulated environment and evaluates agents over a number of episodes (e.g., 'estimated over at least 100 episodes'), rather than using pre-defined training/test/validation splits from a static dataset. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'RLlib' for PPO implementation and 'OpenAI Gym' for its interface, but does not provide specific version numbers for these or other key software dependencies. |
| Experiment Setup | Yes | Unless otherwise stated, we set the learning rate to 5e-5. The policy and value functions are modelled using fully connected neural networks with 2 hidden layers and 256 tanh units in each layer. All experiments with the track-position speed action space have a PID latency of 5 time steps. The PID parameters used for track-position speed control are given in Table 3. |
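The experiment-setup hyperparameters quoted in the table can be collected into an RLlib-style PPO configuration. This is a minimal sketch assuming RLlib's conventional config keys; the paper does not publish its full config, so everything beyond the quoted values (learning rate 5e-5, 2 hidden layers of 256 tanh units) is an assumption.

```python
# Sketch of the reported PPO hyperparameters as an RLlib-style config dict.
# Only "lr", the hidden-layer sizes, and the activation are stated in the
# paper; the key names follow RLlib's config schema and are assumptions here.
ppo_config = {
    "lr": 5e-5,                       # learning rate, as stated in the paper
    "model": {
        "fcnet_hiddens": [256, 256],  # 2 fully connected hidden layers, 256 units each
        "fcnet_activation": "tanh",   # tanh units in each layer
    },
}
```

Both the policy and value networks are described as sharing this architecture, so a single model section suffices in the sketch.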
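The table also notes a PID latency of 5 time steps for the track-position/speed action space. A minimal sketch of a PID controller with such an actuation delay is below; the gains `kp`, `ki`, `kd` are illustrative placeholders, since the paper's actual values live in its Table 3 and are not reproduced here.

```python
from collections import deque


class DelayedPID:
    """Minimal PID controller with actuation latency.

    A sketch of the track-position/speed control the paper describes:
    the computed control signal only takes effect `latency` time steps
    after it is produced. Gains here are hypothetical placeholders.
    """

    def __init__(self, kp, ki, kd, latency=5):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0
        # Pre-fill with zeros so the first `latency` outputs are no-ops,
        # modelling the reported PID latency in time steps.
        self.buffer = deque([0.0] * latency)

    def step(self, error, dt=1.0):
        """Update the PID terms and return the (delayed) control signal."""
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        control = (self.kp * error
                   + self.ki * self.integral
                   + self.kd * derivative)
        self.buffer.append(control)   # enqueue the fresh control signal
        return self.buffer.popleft()  # emit the one computed `latency` steps ago
```

With `latency=5`, the first five calls to `step` return 0.0 before any computed control reaches the actuator, which is the delay behaviour the paper reports.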