WarpDrive: Fast End-to-End Deep Multi-Agent Reinforcement Learning on a GPU
Authors: Tian Lan, Sunil Srinivasa, Huan Wang, Stephan Zheng
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We benchmark WarpDrive in three environments: discrete and continuous versions of the game of Tag (similar to Predator-Prey (Lowe et al., 2017) and Pursuit (Zheng et al., 2017)) and a more complex COVID-19 economic simulation (Trott et al., 2021). All experiments were run on the Google Cloud Platform. We see that WarpDrive, running on a single GPU machine, a2-highgpu-1g (https://cloud.google.com/compute/docs/gpus#a100-gpus), scales almost linearly to thousands of environments and agents, and yields orders-of-magnitude faster MARL compared to a CPU implementation on the n1-standard-16 (https://cloud.google.com/compute/docs/general-purpose-machines#n1_machines). Figure 3 (left) shows how WarpDrive's performance scales in discrete Tag: it scales linearly to over a thousand environments (with 5 agents) and yields almost perfect parallelism. |
| Researcher Affiliation | Industry | Tian Lan, Sunil Srinivasa, Huan Wang, Stephan Zheng; Salesforce Research, Palo Alto, CA 94301, USA |
| Pseudocode | No | The paper describes the architecture and components of WarpDrive, but it does not include any structured pseudocode or algorithm blocks for specific algorithms. |
| Open Source Code | Yes | WarpDrive is a flexible, lightweight, and easy-to-use open-source framework for end-to-end deep multi-agent reinforcement learning (MARL) on a Graphics Processing Unit (GPU), available at https://github.com/salesforce/warp-drive. |
| Open Datasets | No | The paper benchmarks WarpDrive in simulation environments such as the game of Tag and a COVID-19 economic simulation (Trott et al., 2021). Although Trott et al. (2021) links to a GitHub repository for the simulation, the paper describes simulation environments rather than providing access to a pre-existing, publicly available dataset in the traditional sense. |
| Dataset Splits | No | The paper focuses on simulation environments and their performance within the WarpDrive framework, not on traditional datasets with training/test/validation splits. Therefore, no information on dataset splits is provided. |
| Hardware Specification | Yes | All experiments were run on the Google Cloud Platform. We see that WarpDrive, running on a single GPU machine, a2-highgpu-1g (https://cloud.google.com/compute/docs/gpus#a100-gpus), scales almost linearly to thousands of environments and agents, and yields orders-of-magnitude faster MARL compared to a CPU implementation on the n1-standard-16 (https://cloud.google.com/compute/docs/general-purpose-machines#n1_machines). |
| Software Dependencies | No | The paper mentions that 'WarpDrive builds on CUDA', is 'compatible with PyTorch', and uses a 'gym-style multi-agent environment', but does not specify version numbers for these software components. Specific version numbers are required for reproducibility. |
| Experiment Setup | No | The paper focuses on the performance and architecture of the Warp Drive framework itself, detailing how it handles environments and agents in parallel. However, it does not provide specific experimental setup details for training individual RL models, such as hyperparameters (learning rates, batch sizes, optimizers, etc.) for the agents within the simulations. |
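The Research Type row above reports that WarpDrive scales almost linearly to thousands of environments by running them in parallel on the GPU. As a conceptual illustration only (this is not the WarpDrive API; `step_batched` and the Tag-like movement rule are hypothetical), a minimal numpy sketch of batched environment stepping — the idea that enables such scaling — might look like:

```python
import numpy as np

def step_batched(positions, actions, grid_size):
    """Advance all environments and agents in one vectorized update.

    positions: (num_envs, num_agents, 2) integer grid coordinates
    actions:   (num_envs, num_agents) indices into MOVES
    Returns the new positions, clipped to stay on the grid.
    """
    # Actions 0..3 = up, down, left, right; 4 = stay.
    MOVES = np.array([[-1, 0], [1, 0], [0, -1], [0, 1], [0, 0]])
    return np.clip(positions + MOVES[actions], 0, grid_size - 1)

# 1000 Tag-like environments x 5 agents stepped in a single call,
# instead of a Python loop over environments.
rng = np.random.default_rng(0)
pos = rng.integers(0, 10, size=(1000, 5, 2))
acts = rng.integers(0, 5, size=(1000, 5))
new_pos = step_batched(pos, acts, grid_size=10)
```

Because every environment is a slice of one array and the step is a single batched operation, adding more environments adds rows rather than iterations, which is why throughput can stay near-linear on parallel hardware.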