Probing the Latent Hierarchical Structure of Data via Diffusion Models

Authors: Antonio Sclocchi, Alessandro Favero, Noam Levi, Matthieu Wyart

ICLR 2025

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Remarkably, we confirm this prediction in both text and image datasets using state-of-the-art diffusion models. Our results show how latent variable changes manifest in the data and establish how to measure these effects in real data using diffusion models." |
| Researcher Affiliation | Academia | Antonio Sclocchi (Institute of Physics, EPFL); Alessandro Favero (Institute of Physics, EPFL); Noam Itzhak Levi (Institute of Physics, EPFL); Matthieu Wyart (Department of Physics and Astronomy, Johns Hopkins) |
| Pseudocode | No | The paper describes algorithms such as Belief Propagation and the diffusion processes in structured text and equations, but does not present them in a clearly labeled "Pseudocode" or "Algorithm" block. |
| Open Source Code | No | The paper contains no explicit statement about releasing code and no link to a code repository for the described methodology. |
| Open Datasets | Yes | "We perform forward-backward experiments with state-of-the-art masked diffusion language models (MDLM) (Sahoo et al., 2024) on WikiText. ... We extend our analysis to computer vision by considering Improved Denoising Diffusion Probabilistic Models (Nichol & Dhariwal, 2021), trained on the ImageNet dataset." |
| Dataset Splits | Yes | "We present the average correlation functions and the susceptibility for vision DDPMs, starting from samples of the ImageNet validation set (Deng et al., 2009)." |
| Hardware Specification | No | The paper reports experiments with language and vision diffusion models but gives no hardware details (e.g., GPU models, CPU types, memory). |
| Software Dependencies | No | The paper names models and tools such as the GPT-2 tokenizer, CLIP ViT-B/32, MDLM, and Improved Denoising Diffusion Probabilistic Models, but specifies no software versions for them or for the underlying languages and libraries (e.g., Python, PyTorch, CUDA). |
| Experiment Setup | Yes | "The results are averaged over NS = 300 samples, each consisting of NT = 128 tokens, with NR = 50 noise realizations for each masking fraction. ... Data obtained with 344 starting images and 128 diffusion trajectories per starting image. ... we divide each image into 7×7 patches and use the last-layer embeddings for each patch from a CLIP ViT-B/32 (Radford et al., 2021) to tokenize the image." |
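The forward-backward averaging described in the Experiment Setup row (NS samples of NT tokens, NR noise realizations per masking fraction, then an average over the fraction of changed tokens) can be sketched as below. This is a minimal illustration, not the paper's code: `forward_backward`, `avg_change_fraction`, and the `toy_denoise` stand-in are hypothetical names, and a real run would replace `toy_denoise` with the MDLM denoiser.

```python
import numpy as np


def forward_backward(tokens, t, denoise, rng):
    """Forward: mask a fraction t of token positions.
    Backward: let the denoiser resample the masked positions."""
    mask = rng.random(tokens.shape) < t
    return np.where(mask, denoise(tokens, mask, rng), tokens)


def avg_change_fraction(samples, t, denoise, n_real=50, seed=0):
    """Average fraction of tokens changed by one forward-backward pass,
    over all samples and n_real noise realizations each."""
    rng = np.random.default_rng(seed)
    fracs = [
        np.mean(forward_backward(s, t, denoise, rng) != s)
        for s in samples
        for _ in range(n_real)
    ]
    return float(np.mean(fracs))


# Hypothetical stand-in denoiser: resamples masked positions uniformly
# over a GPT-2-sized vocabulary (the paper would call the MDLM here).
def toy_denoise(tokens, mask, rng):
    return rng.integers(0, 50257, size=tokens.shape)


rng = np.random.default_rng(1)
samples = [rng.integers(0, 50257, size=128) for _ in range(10)]  # NT = 128
print(avg_change_fraction(samples, t=0.3, denoise=toy_denoise, n_real=5))
```

With a masking fraction of t = 0.3 and a near-uniform resampler, roughly 30% of tokens change per pass; the paper's quantities of interest (correlation functions, susceptibility) are computed over the same kind of sample-and-realization average.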