Enhanced Diffusion Sampling via Extrapolation with Multiple ODE Solutions

Authors: Jinyoung Choi, Junoh Kang, Bohyung Han

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through a series of experiments, we show that the proposed method improves the quality of generated samples without requiring additional sampling iterations. Our experiments across various well-known baselines demonstrate that RX-DPM exhibits strong generalization performance and high practicality, regardless of ODE designs, model architectures, and base samplers. We conduct the experiments with EDM (Karras et al., 2022), Stable Diffusion V2 (Rombach et al., 2022), DPM-Solver (Lu et al., 2022), and PNDM (Liu et al., 2022) using their official implementations and provided pretrained models. For experiments with EDM, DPM-Solver, and PNDM as backbones, we generate 50K images and compute FID (Heusel et al., 2017) using the evaluation code provided in their implementations. To evaluate Stable Diffusion V2 results, we use the PyTorch implementation for the computation of FID and CLIP score with a patch size of 32×32.
Researcher Affiliation | Academia | Jinyoung Choi¹, Junoh Kang¹ & Bohyung Han¹,², Computer Vision Lab., ¹ECE & ²IPAI, Seoul National University, Korea (EMAIL)
Pseudocode | Yes | Algorithm 1 summarizes the procedure of the proposed method with a generic ODE solver, under the simplifying assumption that N is a multiple of k; it is simple to handle the last few steps by either adjusting k for the remaining steps or skipping the extrapolation.
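The report does not reproduce Algorithm 1 itself, but its core idea, extrapolating between a coarse and a k-times-finer solution of the same ODE segment, can be sketched on a toy ODE. The first-order Richardson weights and the toy linear ODE below are illustrative assumptions standing in for RX-DPM's exact coefficients and the diffusion probability-flow ODE:

```python
import numpy as np

def euler_solve(f, x0, t0, t1, n_steps):
    """Plain Euler integration of dx/dt = f(t, x) from t0 to t1."""
    x, t = x0, t0
    h = (t1 - t0) / n_steps
    for _ in range(n_steps):
        x = x + h * f(t, x)
        t += h
    return x

def rx_euler(f, x0, t0, t1, n_steps, k=2):
    """Extrapolated Euler sketch: on each segment, combine one coarse step
    with k fine steps. Assumes n_steps is a multiple of k, mirroring the
    paper's simplifying assumption on N. The (k, k-1) weights are standard
    first-order Richardson extrapolation, not the paper's exact scheme."""
    x = x0
    ts = np.linspace(t0, t1, n_steps // k + 1)
    for a, b in zip(ts[:-1], ts[1:]):
        coarse = euler_solve(f, x, a, b, 1)  # one big step over [a, b]
        fine = euler_solve(f, x, a, b, k)    # k small steps over [a, b]
        x = (k * fine - coarse) / (k - 1)    # cancel the O(h) error term
    return x

# Toy linear ODE dx/dt = -x with exact solution exp(-1) at t = 1.
f = lambda t, x: -x
exact = np.exp(-1.0)
plain = euler_solve(f, 1.0, 0.0, 1.0, 8)
extrap = rx_euler(f, 1.0, 0.0, 1.0, 8, k=2)
assert abs(extrap - exact) < abs(plain - exact)
```

On this toy problem the extrapolated trajectory is markedly closer to the exact solution than plain Euler with the same base grid, which is the effect the paper reports for sample quality at a fixed step budget.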
Open Source Code | Yes | The full implementation is available at https://github.com/jin01020/rx-dpm.
Open Datasets | Yes | We compare RX-Euler with other methods on four different datasets: CIFAR-10 (Krizhevsky & Hinton, 2009), FFHQ (Karras et al., 2019), AFHQv2 (Choi et al., 2020), and ImageNet (Deng et al., 2009) using the EDM (Karras et al., 2022) backbone. For evaluation, we generate 10K 512×512 images from unique text prompts in the COCO2014 (Lin et al., 2014) validation set and compute FID and CLIP scores on resized 256×256 images. Table 2 presents the effectiveness of RX-DPM when applied to DPM-Solvers (Lu et al., 2022) on CIFAR-10 and LSUN Bedroom (Yu et al., 2015). The results on the CIFAR-10, CelebA (Liu et al., 2015), and LSUN Church (Yu et al., 2015) datasets are presented in Table 3.
Dataset Splits | Yes | For experiments with EDM, DPM-Solver, and PNDM as backbones, we generate 50K images and compute FID (Heusel et al., 2017) using the evaluation code provided in their implementations. To evaluate Stable Diffusion V2 results, we use the PyTorch implementation for the computation of FID and CLIP score with a patch size of 32×32. For evaluation, we generate 10K 512×512 images from unique text prompts in the COCO2014 (Lin et al., 2014) validation set and compute FID and CLIP scores on resized 256×256 images.
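For context on the FID numbers quoted throughout, FID is the Fréchet distance between Gaussian fits (mean and covariance) of Inception features from real and generated images. A minimal NumPy-only sketch of the distance itself is below; the Inception feature extraction is omitted, and the symmetric-square-root reformulation is a standard numerical trick, not something taken from the paper:

```python
import numpy as np

def _sqrtm_psd(mat):
    """Matrix square root of a symmetric PSD matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2):
    ||mu1 - mu2||^2 + Tr(s1 + s2 - 2 (s1 s2)^{1/2}).
    Tr((s1 s2)^{1/2}) is computed as Tr((s1^{1/2} s2 s1^{1/2})^{1/2}),
    which keeps every matrix symmetric PSD."""
    diff = mu1 - mu2
    s1_half = _sqrtm_psd(sigma1)
    covmean = _sqrtm_psd(s1_half @ sigma2 @ s1_half)
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))

# Identical statistics give zero; shifting the mean by v adds ||v||^2.
mu, sigma = np.zeros(3), np.eye(3)
assert abs(frechet_distance(mu, sigma, mu, sigma)) < 1e-8
assert abs(frechet_distance(mu, sigma, mu + 2.0, sigma) - 12.0) < 1e-8
```

In practice the statistics come from Inception-v3 activations over the 50K (or 10K) generated images and a reference set, which is what the quoted evaluation code computes.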
Hardware Specification | Yes | For measurements, we set the batch size to 128 and use 10-step sampling on an A6000 GPU.
Software Dependencies | No | We conduct the experiments with EDM (Karras et al., 2022), Stable Diffusion V2 (Rombach et al., 2022), DPM-Solver (Lu et al., 2022), and PNDM (Liu et al., 2022) using their official implementations and provided pretrained models. To evaluate Stable Diffusion V2 results, we use the PyTorch implementation for the computation of FID and CLIP score with a patch size of 32×32. No specific software versions are provided for PyTorch or other libraries used in the official implementations.
Experiment Setup | Yes | Throughout all experiments, we retain the default settings from the official codebases, except for additional hyperparameters related to the proposed method. For measurements, we set the batch size to 128 and use 10-step sampling on an A6000 GPU.