A Geometric Framework for Understanding Memorization in Generative Models
Authors: Brendan Ross, Hamidreza Kamkari, Tongzi Wu, Rasa Hosseinzadeh, Zhaoyan Liu, George Stein, Jesse Cresswell, Gabriel Loaiza-Ganem
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate the MMH using synthetic data and image datasets up to the scale of Stable Diffusion, developing new tools for detecting and preventing generation of memorized samples in the process. |
| Researcher Affiliation | Industry | Layer 6 AI |
| Pseudocode | No | The paper describes methods and propositions through mathematical formulations and textual descriptions, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | To ensure the reproducibility of our experiments, we provide two codebase links. The first codebase, accessible at github.com/layer6ai-labs/dgm_geometry, contains our small-scale synthetic experiments and our CIFAR10 experiments. The second, accessible at github.com/layer6ai-labs/diffusion_memorization/, extends the work of Wen et al. (2023) to use the MMH to detect and mitigate memorization. |
| Open Datasets | Yes | We analyze the higher-dimensional CIFAR10 dataset (Krizhevsky & Hinton, 2009) and use two pretrained generative models... we retrieve memorized LAION (Schuhmann et al., 2022) training images identified by Webster (2023)... a mix of 2000 images sampled from LAION Aesthetics 6.5+, 2000 sampled from COCO (Lin et al., 2014), and all 251 images from the Tuxemon dataset (Tuxemon Project, 2024; Hugging Face, 2024). All datasets used in our experiments are freely available from the referenced sources and are utilized in compliance with their respective licenses. |
| Dataset Splits | No | The paper states that it uses pretrained generative models on CIFAR10 and Stable Diffusion. It describes how samples were generated and selected for analysis (e.g., 'generate 50,000 images', 'take the closest 250 neighbours'), but it does not specify training, validation, or test splits, since no model training is reproduced within the scope of this paper. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types, or memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions using the 'cv2 package' for PNG compression but does not provide specific version numbers for it or any other key software dependencies used in their experimental setup, other than linking to codebases. |
| Experiment Setup | No | The paper describes experimental methodologies for LID estimation and mitigation approaches, including some hyperparameters for LID estimation (e.g., t0 values for FLIPD and NB, k for Local PCA). However, it does not provide a comprehensive set of hyperparameters, system-level training settings for the generative models themselves, or a detailed table of experimental configurations needed to reproduce the full setup. |
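The "Software Dependencies" and "Hardware Specification" rows above flag missing version and environment information. As a minimal sketch of how such information could be captured alongside a codebase, the snippet below records the Python version, platform string, and installed versions of selected packages using only the standard library; the package names passed in (e.g., `opencv-python`, which provides the `cv2` module the paper mentions) are illustrative assumptions, not a list taken from the paper.

```python
import importlib.metadata as md
import json
import platform
import sys


def environment_report(packages):
    """Collect Python/platform info and installed versions of the given packages.

    Packages that are not installed are recorded as None rather than raising,
    so the report can be generated on any machine.
    """
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": {},
    }
    for name in packages:
        try:
            report["packages"][name] = md.version(name)
        except md.PackageNotFoundError:
            report["packages"][name] = None  # not installed in this environment
    return report


if __name__ == "__main__":
    # 'opencv-python' (the cv2 package) is mentioned in the paper; 'torch' is
    # a hypothetical addition for illustration only.
    print(json.dumps(environment_report(["opencv-python", "torch"]), indent=2))
```

Dumping such a report to a JSON file in the repository would pin exactly the version details this review finds missing.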