Anti-Exposure Bias in Diffusion Models
Authors: Junyu Zhang, Daochang Liu, Eunbyung Park, Shichao Zhang, Chang Xu
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results on various DMs demonstrate the superiority of our prompt learning framework across three benchmark datasets. |
| Researcher Affiliation | Academia | Junyu Zhang, School of Computer Science and Engineering, Central South University; Daochang Liu, School of Physics, Mathematics and Computing, University of Western Australia; Eunbyung Park, Department of Artificial Intelligence, Yonsei University; Shichao Zhang, School of Computer Science and Engineering, Guangxi Normal University; Chang Xu, School of Computer Science, University of Sydney |
| Pseudocode | Yes | Algorithm 1: Anti-Bias Sampling. Data: pre-trained DM s_θ, optimized prompt prediction model υ_φ, default sampler S, pre-defined noise schedule L = {σ_{t_0}, ..., σ_{t_T}}, total sampling steps T. Result: new images x^{anti-bias}_{t_0}. (1) Sample a batch of x_{t_T} from a prior distribution π; (2) x_temp = x_{t_T}; (3) for t_i = t_T to t_0: (4) x̂_{t_i} = S(s_θ, σ_{t_i}, x_temp); (5) x^{anti-bias}_{t_i} = υ_φ(x̂_{t_i}) + x̂_{t_i} (this anti-bias rectification is the only difference compared to the original sampling schedule); (6) x_temp = x^{anti-bias}_{t_i}; end for; (7) x^{anti-bias}_{t_0} = x_temp. |
| Open Source Code | Yes | Our code is available at: Anti Exposure Bias. |
| Open Datasets | Yes | To evaluate the effectiveness of our prompt learning framework in reducing exposure bias, we conduct experiments on three benchmark datasets: CIFAR-10 (Krizhevsky et al., 2009), CelebA 64×64 (Liu et al., 2015), and ImageNet 256×256 |
| Dataset Splits | No | The paper mentions using CIFAR-10 (Krizhevsky et al., 2009), CelebA 64×64 (Liu et al., 2015), and ImageNet 256×256. However, it does not explicitly provide training/test/validation dataset splits, specific percentages, sample counts, or citations to predefined splits used for these experiments. |
| Hardware Specification | Yes | To train the model, we allocate A100 GPUs to optimize them and test the experimental results on just one A100 GPU. Specifically, we employ 8 A100 GPUs for training CIFAR-10 and CelebA, while we use 16 A100 GPUs for training on ImageNet. Additionally, we allocate only 4 A100 GPUs for training the prompt prediction model for latent diffusion. |
| Software Dependencies | No | The paper mentions following the EDM framework and using NCSN++ as a backbone but does not specify any software names with version numbers (e.g., Python, PyTorch, CUDA versions) used for their implementation. |
| Experiment Setup | Yes | During training, we set the batch size to 1024 for all experiments and keep other hyperparameters the same as in EDM training. Detailed training settings can be found in (Karras et al., 2022), and we maintain the default values. ... For most experiments, the training iterations range from 100k to 150k across all datasets... Specifically, we set the EMA value to 0.9999 when training our model on CIFAR-10, and we set the EMA to 0.999943 for LSUN and ImageNet. |
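The sampling loop in Algorithm 1 can be sketched as a short framework-agnostic Python function. This is a minimal illustration, not the authors' released code: `score_model`, `prompt_model`, and `sampler` are placeholder callables standing in for the pre-trained DM s_θ, the prompt prediction model υ_φ, and the default sampler step S.

```python
def anti_bias_sampling(score_model, prompt_model, sampler, sigmas, x_T):
    """Sketch of Algorithm 1 (Anti-Bias Sampling).

    score_model  -- pre-trained diffusion model s_theta (passed to the sampler)
    prompt_model -- optimized prompt prediction model upsilon_phi
    sampler      -- default sampler step S(s_theta, sigma, x) -> x_hat
    sigmas       -- pre-defined noise schedule, ordered sigma_{t_T} .. sigma_{t_0}
    x_T          -- batch sampled from the prior distribution pi
    """
    x_temp = x_T
    for sigma in sigmas:                              # for t_i = t_T to t_0
        x_hat = sampler(score_model, sigma, x_temp)   # default sampler step
        # Anti-bias rectification: the only change vs. the original schedule
        x_temp = prompt_model(x_hat) + x_hat
    return x_temp                                     # x^{anti-bias}_{t_0}
```

Because the rectification wraps an unchanged sampler step, any existing sampler (e.g. the EDM sampler the paper builds on) can be dropped in as `sampler` without modification.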