Deep Surrogate Assisted Generation of Environments
Authors: Varun Bhatt, Bryon Tjanaka, Matthew Fontaine, Stefanos Nikolaidis
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Results in two benchmark domains show that DSAGE significantly outperforms existing QD environment generation algorithms in discovering collections of environments that elicit diverse behaviors of a state-of-the-art RL agent and a planning agent. Our source code and videos are available at https://dsagepaper.github.io/ |
| Researcher Affiliation | Academia | Varun Bhatt University of Southern California Los Angeles, CA EMAIL Bryon Tjanaka University of Southern California Los Angeles, CA EMAIL Matthew C. Fontaine University of Southern California Los Angeles, CA EMAIL Stefanos Nikolaidis University of Southern California Los Angeles, CA EMAIL |
| Pseudocode | Yes | Algorithm 1: Deep Surrogate Assisted Generation of Environments (DSAGE) |
| Open Source Code | Yes | Our source code and videos are available at https://dsagepaper.github.io/ |
| Open Datasets | Yes | We test our algorithms in two benchmark domains from prior work: a Maze domain [82, 3, 4] with a trained ACCEL agent [4] and a Mario domain [83, 16] with an A* agent [22]. [...] The Maze domain is based on the MiniGrid environment [82]. [...] The Mario domain is based on the Mario AI Framework [83, 94]. |
| Dataset Splits | No | The paper does not specify explicit training/validation/test dataset splits for the dynamically generated dataset 'D' used by the DSAGE algorithm or for the training of the surrogate model during the main experimental runs. It mentions creating a 'combined dataset' for post-hoc evaluation of surrogate models, but not for the primary experimental setup. |
| Hardware Specification | Yes | One of the GPUs used in the experiments was awarded by the NVIDIA Academic Hardware Grant. [...] All experiments were run on computers with Intel Core i9-9900K and NVIDIA GeForce RTX 2080 Ti GPUs. |
| Software Dependencies | Yes | Our implementation is in Python 3.8 with PyTorch 1.10. We use pyribs [99] for QD optimization. |
| Experiment Setup | Yes | We train the CNN surrogate models for 100 epochs using the Adam optimizer [100] with a learning rate of 0.001. The Adam optimizer's hyperparameters are set to β1 = 0.9, β2 = 0.999, ε = 10−8, and weight decay is set to 0.0. We use a batch size of 32 for the surrogate model. [...] For MAP-Elites, we use a batch size of 20, and for CMA-ME, we use a batch size of 10. [...] the inner loop for DSAGE is run for Nexploit = 1000 iterations. |
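The surrogate-training configuration quoted above maps directly onto PyTorch, the framework the paper reports using. The sketch below is illustrative only: the `surrogate` network is a hypothetical placeholder (the paper's actual CNN architecture is not reproduced here, and the 16×16 input size is an assumption), while the optimizer settings, epoch count, and batch size follow the quoted values.

```python
import torch
from torch import nn

# Hypothetical stand-in for the paper's CNN surrogate model.
# The real architecture and input dimensions differ; a 16x16
# single-channel environment grid is assumed for illustration.
surrogate = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(16 * 16 * 16, 1),
)

# Adam configuration as quoted in the Experiment Setup row.
optimizer = torch.optim.Adam(
    surrogate.parameters(),
    lr=0.001,
    betas=(0.9, 0.999),
    eps=1e-8,
    weight_decay=0.0,
)

EPOCHS = 100      # surrogate models trained for 100 epochs
BATCH_SIZE = 32   # surrogate-model batch size from the paper
```

Note that the batch sizes of 20 (MAP-Elites) and 10 (CMA-ME) apply to the QD optimizers' solution batches, not to this surrogate-training loop.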