Improving Probabilistic Diffusion Models With Optimal Diagonal Covariance Matching
Authors: Zijing Ou, Mingtian Zhang, Andi Zhang, Tim Xiao, Yingzhen Li, David Barber
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To support our theoretical discussion, we first evaluate the performance of optimal covariance matching by training diffusion probabilistic models on 2D toy examples. We then demonstrate its effectiveness in enhancing image modelling in both pixel and latent spaces, focusing on the comparison between optimal covariance matching and other covariance estimation methods, and showing that the proposed approach has the potential to scale to large image generation tasks. |
| Researcher Affiliation | Academia | 1Imperial College London, 2University College London, 3University of Manchester, 4Max Planck Institute for Intelligent Systems, Tübingen, 5University of Tübingen, 6IMPRS-IS. |
| Pseudocode | Yes | Algorithm 1: Sampling procedure from x_t to x_{t-1} in OCM-DDPM. Algorithm 2: Sampling procedure from x_t to x_{t-1} in OCM-DDIM. |
| Open Source Code | Yes | Code is available at: https://github.com/J-zin/OCM_DPM. |
| Open Datasets | Yes | In this experiment, we mainly focus on four datasets: CIFAR10 (Krizhevsky et al., 2009) with the linear schedule (LS) of βt (Ho et al., 2020) and the cosine schedule (CS) of βt (Nichol & Dhariwal, 2021); CelebA (Liu et al., 2015); LSUN Bedroom (Yu et al., 2015). |
| Dataset Splits | Yes | The FID score is computed on 50K generated samples. Following Nichol & Dhariwal (2021); Bao et al. (2022a); Peebles & Xie (2023), the reference distribution statistics for FID are calculated using the full training set for CIFAR10 and ImageNet, and 50K training samples for CelebA and LSUN Bedroom. |
| Hardware Specification | Yes | We train our models using one A100-80G GPU for CIFAR10, CelebA 64x64, and ImageNet 64x64; four A100-80G GPUs for LSUN Bedroom; and eight A100-80G GPUs for ImageNet 256x256. |
| Software Dependencies | No | The paper mentions optimizers like AdamW (Loshchilov, 2017) and Adam (Kingma & Ba, 2014), but does not specify software dependencies with version numbers (e.g., PyTorch version, Python version, CUDA version). |
| Experiment Setup | Yes | We use the AdamW optimizer (Loshchilov, 2017) with a learning rate of 0.0001 and train for 500K iterations across all datasets. The batch sizes are set to 64 for LSUN Bedroom, 128 for CIFAR10, CelebA 64x64, and ImageNet 64x64, and 256 for ImageNet 256x256. |
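The reported experiment setup can be collected into a single configuration table. The sketch below is a minimal, framework-free transcription of the hyperparameters quoted in the row above (AdamW, learning rate 1e-4, 500K iterations, per-dataset batch sizes); the dictionary keys and the `batch_size` helper are illustrative names, not identifiers from the paper or its codebase.

```python
# Hedged transcription of the training configuration reported in the
# "Experiment Setup" row; dataset keys and helper name are hypothetical.
TRAIN_CONFIG = {
    "optimizer": "AdamW",       # Loshchilov (2017)
    "learning_rate": 1e-4,
    "iterations": 500_000,      # same budget across all datasets
    "batch_size": {
        "lsun_bedroom": 64,
        "cifar10": 128,
        "celeba_64x64": 128,
        "imagenet_64x64": 128,
        "imagenet_256x256": 256,
    },
}


def batch_size(dataset: str) -> int:
    """Return the reported per-dataset batch size."""
    return TRAIN_CONFIG["batch_size"][dataset]
```

Keeping the values in one dictionary makes it easy to cross-check a reimplementation against the paper's reported settings.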