Moonshine: Distilling Game Content Generators into Steerable Generative Models

Authors: Yuhe Nie, Michael Middleton, Tim Merino, Nidhushan Kanagaraja, Ashutosh Kumar, Zhan Zhuang, Julian Togelius

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We compare our distilled models with the baseline constructive algorithm. Our analysis of the variety, accuracy, and quality of our generation demonstrates the efficacy of distilling constructive methods into controllable text-conditioned PCGML models. The outputs of these models are evaluated through both quantitative and qualitative metrics.
Researcher Affiliation Academia New York University Game Innovation Lab, Brooklyn, New York, USA; Southern University of Science and Technology, Shenzhen, China; City University of Hong Kong, Kowloon, Hong Kong SAR, China
Pseudocode No The paper describes the architecture of the Five-Dollar Model and Discrete Diffusion Model, and the steps for synthetic data generation, but does not present them in structured pseudocode or algorithm blocks.
Open Source Code No The paper notes that the dataset is open-sourced at huggingface.co/datasets/DolphinNie/dungeon-dataset, but it provides no access link or explicit release statement for the source code of the methodology itself.
Open Datasets Yes The dataset is available at huggingface.co/datasets/DolphinNie/dungeon-dataset.
Dataset Splits Yes We create a dataset of maps from the game, split into 49,000 training points, 14,000 test points, and 7,000 validation points.
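The reported split (49,000 / 14,000 / 7,000 out of 70,000 maps) corresponds to a 70/20/10 train/test/validation partition. As a minimal sketch of how such a split could be reproduced, assuming only that the maps are shuffled with a fixed seed (the seed and split order below are assumptions, not taken from the paper):

```python
import random

def split_dataset(items, train_frac=0.7, test_frac=0.2, seed=0):
    """Shuffle items deterministically and cut them into
    train/test/validation partitions (validation gets the remainder)."""
    rng = random.Random(seed)  # assumed seed, not from the paper
    shuffled = items[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * train_frac)
    n_test = int(n * test_frac)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_test],
            shuffled[n_train + n_test:])

# Stand-in for the 70,000 dungeon maps described in the paper.
maps = list(range(70_000))
train_set, test_set, val_set = split_dataset(maps)
print(len(train_set), len(test_set), len(val_set))  # 49000 14000 7000
```

Without the authors' seed and shuffling procedure, only the partition sizes, not the exact membership of each split, can be reproduced.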
Hardware Specification No The paper does not provide specific hardware details such as GPU or CPU models used for running its experiments.
Software Dependencies No The paper mentions specific models such as 'gte-large-en-v1.5' and OpenAI's GPT-4 Turbo (gpt-4-turbo-2024-04-09), but does not provide version numbers for the ancillary software libraries or frameworks used in its implementation.
Experiment Setup No The paper describes the model architectures and loss functions used for the Five-Dollar Model and Discrete Diffusion Model, but it does not provide specific hyperparameter values such as learning rates, batch sizes, number of epochs, or optimizer settings for training.