Deep Surrogate Assisted Generation of Environments
Authors: Varun Bhatt, Bryon Tjanaka, Matthew Fontaine, Stefanos Nikolaidis
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Results in two benchmark domains show that DSAGE significantly outperforms existing QD environment generation algorithms in discovering collections of environments that elicit diverse behaviors of a state-of-the-art RL agent and a planning agent. Our source code and videos are available at https://dsagepaper.github.io/ |
| Researcher Affiliation | Academia | Varun Bhatt University of Southern California Los Angeles, CA EMAIL Bryon Tjanaka University of Southern California Los Angeles, CA EMAIL Matthew C. Fontaine University of Southern California Los Angeles, CA EMAIL Stefanos Nikolaidis University of Southern California Los Angeles, CA EMAIL |
| Pseudocode | Yes | Algorithm 1: Deep Surrogate Assisted Generation of Environments (DSAGE) |
| Open Source Code | Yes | Our source code and videos are available at https://dsagepaper.github.io/ |
| Open Datasets | Yes | We test our algorithms in two benchmark domains from prior work: a Maze domain [82, 3, 4] with a trained ACCEL agent [4] and a Mario domain [83, 16] with an A* agent [22]. [...] The Maze domain is based on the MiniGrid environment [82]. [...] The Mario domain is based on the Mario AI Framework [83, 94]. |
| Dataset Splits | No | The paper does not specify explicit training/validation/test dataset splits for the dynamically generated dataset 'D' used by the DSAGE algorithm or for the training of the surrogate model during the main experimental runs. It mentions creating a 'combined dataset' for post-hoc evaluation of surrogate models, but not for the primary experimental setup. |
| Hardware Specification | Yes | One of the GPUs used in the experiments was awarded by the NVIDIA Academic Hardware Grant. [...] All experiments were run on computers with Intel Core i9-9900K and NVIDIA GeForce RTX 2080 Ti GPUs. |
| Software Dependencies | Yes | Our implementation is in Python 3.8 with PyTorch 1.10. We use pyribs [99] for QD optimization. |
| Experiment Setup | Yes | We train the CNN surrogate models for 100 epochs using the Adam optimizer [100] with a learning rate of 0.001. The Adam optimizer's hyperparameters are set to β1 = 0.9, β2 = 0.999, ε = 10−8, and weight decay is set to 0.0. We use a batch size of 32 for the surrogate model. [...] For MAP-Elites, we use a batch size of 20, and for CMA-ME, we use a batch size of 10. [...] the inner loop for DSAGE is run for Nexploit = 1000 iterations. |
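The surrogate-training configuration quoted above maps directly onto PyTorch, the framework the paper reports using. The sketch below is illustrative only: the `surrogate` network is a hypothetical placeholder (the paper's actual CNN architecture is not reproduced here, and the 16×16 input size is an assumption), while the optimizer settings, epoch count, and batch size follow the quoted values.

```python
import torch
from torch import nn

# Hypothetical stand-in for the paper's CNN surrogate model.
# The real architecture and input dimensions differ; a 16x16
# single-channel environment grid is assumed for illustration.
surrogate = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(16 * 16 * 16, 1),
)

# Adam configuration as quoted in the Experiment Setup row.
optimizer = torch.optim.Adam(
    surrogate.parameters(),
    lr=0.001,
    betas=(0.9, 0.999),
    eps=1e-8,
    weight_decay=0.0,
)

EPOCHS = 100      # surrogate models trained for 100 epochs
BATCH_SIZE = 32   # surrogate-model batch size from the paper
```

Note that the batch sizes of 20 (MAP-Elites) and 10 (CMA-ME) apply to the QD optimizers' solution batches, not to this surrogate-training loop.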