Denoising Lévy Probabilistic Models

Authors: Dario Shariatian, Umut Simsekli, Alain Oliviero Durmus

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that DLPM provides (i) better coverage of the tails of the data distribution, (ii) improved generation of unbalanced datasets, and (iii) faster computation times, requiring fewer backward steps.
Researcher Affiliation | Academia | ¹INRIA, Department of Computer Science, PSL Research University, Paris, France; ²École Polytechnique, CMAP, IP Paris, Palaiseau, France
Pseudocode | Yes | Appendix A ("Algorithms for DLPM and DLIM") explicitly provides the algorithms needed to train and sample from the DLPM and DLIM generative methods: Algorithm 2 (DLPM training, simplified loss), Algorithm 3 (stochastic sampling, DLPM), and Algorithm 4 (deterministic sampling, DLIM).
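The DLPM algorithms hinge on replacing Gaussian noise with heavy-tailed α-stable noise. A minimal sketch of drawing symmetric α-stable samples via the standard Chambers–Mallows–Stuck method (an illustrative helper under our own naming, not the authors' implementation; the choice α = 1.8 is likewise illustrative):

```python
import math
import random

def sample_symmetric_alpha_stable(alpha, rng=random):
    """Draw one symmetric alpha-stable sample (skew beta = 0, unit scale)
    via the Chambers-Mallows-Stuck method."""
    u = rng.uniform(-math.pi / 2, math.pi / 2)  # uniform angle
    w = rng.expovariate(1.0)                    # unit-mean exponential
    if abs(alpha - 1.0) < 1e-12:
        return math.tan(u)                      # alpha = 1 is the Cauchy case
    return (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
            * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))

# Example: heavy-tailed noise for a forward-process step (alpha = 1.8)
noise = [sample_symmetric_alpha_stable(1.8) for _ in range(10000)]
```

For α < 2 the samples have infinite variance, which is what gives the model its improved tail coverage; α = 2 recovers Gaussian noise up to scale.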
Open Source Code | No | The paper mentions using a third-party implementation: "We use a U-Net following the implementation of Nichol & Dhariwal (2021) available in https://github.com/openai/improved-diffusion." However, there is no explicit statement or link for the authors' own source code for the methodology described in this paper.
Open Datasets | Yes | In our experiments on images, we make use of the CIFAR10_LT (long-tail) dataset, introduced in Yoon et al. (2023) as an unbalanced modification of the CIFAR10 dataset. We work on the MNIST and CIFAR10_LT datasets.
Dataset Splits | Yes | For our 2D datasets, we use 32000 datapoints for training, a batch size of 1024, and 25000 points for evaluation. CIFAR10_LT consists of CIFAR10 images where artificial class imbalance has been introduced. The specific class counts we use are [5000, 2997, 1796, 1077, 645, 387, 232, 139, 83, 50].
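The listed class counts match the standard long-tailed construction, where class k keeps n_max · ρ^(k/(K−1)) samples with imbalance ratio ρ = 50/5000 = 0.01 (an observation about the numbers themselves, not a claim about the authors' exact script):

```python
# Long-tailed class counts: exponential decay from 5000 down to 50
# across 10 classes, imbalance ratio 50/5000 = 0.01.
n_max, n_min, num_classes = 5000, 50, 10
ratio = n_min / n_max
counts = [int(n_max * ratio ** (k / (num_classes - 1)))
          for k in range(num_classes)]
print(counts)  # -> [5000, 2997, 1796, 1077, 645, 387, 232, 139, 83, 50]
```

Reproducing the counts this way makes it easy to rebuild the unbalanced subset from a full CIFAR10 copy by truncating each class to `counts[k]` images.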
Hardware Specification | Yes | All training and experiments are conducted on four NVIDIA RTX 8000 GPUs and four NVIDIA V100 GPUs.
Software Dependencies | No | The paper mentions PyTorch for experiments and Adam as the optimizer, but no specific version numbers for any software dependencies are provided.
Experiment Setup | Yes | For our 2D datasets, we use 32000 datapoints for training, a batch size of 1024, and 25000 points for evaluation. We train each model for 10000 steps. ... The optimizer is Adam (Kingma & Ba, 2017) with learning rate 5e-3. We train MNIST for 120000 steps with batch size 256 and a time horizon T = 1000, and CIFAR10_LT for 400000 steps with batch size 100 and a time horizon T = 4000. The optimizer is Adam with learning rate 1e-3 for MNIST and 2e-4 for CIFAR10_LT. We use the StepLR scheduler, which scales the learning rate by γ = 0.99 every N = 1000 steps for CIFAR10_LT and N = 400 for MNIST. We use an exponential moving average with a rate of 0.99 for MNIST and 0.9999 for CIFAR10_LT.
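The StepLR schedule and weight EMA quoted above reduce to two simple update rules; a pure-Python sketch of the CIFAR10_LT settings (illustrative, not the authors' training code):

```python
# StepLR: multiply the learning rate by gamma once every step_size steps.
def lr_at_step(step, base_lr=2e-4, gamma=0.99, step_size=1000):
    return base_lr * gamma ** (step // step_size)

# EMA of model weights, rate 0.9999 for CIFAR10_LT (weights shown as floats).
def ema_update(ema_params, params, decay=0.9999):
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema_params, params)]

print(lr_at_step(0))       # -> 0.0002
print(lr_at_step(400000))  # 2e-4 * 0.99**400, roughly 3.6e-6
```

Over the full 400000 CIFAR10_LT steps the schedule applies 400 decay factors, so the final learning rate is about 1.8% of the initial one.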