Domain Guidance: A Simple Transfer Approach for a Pre-trained Diffusion Model

Authors: Jincheng Zhong, Xiangcheng Zhang, Jianmin Wang, Mingsheng Long

ICLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Our experimental results demonstrate its substantial effectiveness across various transfer benchmarks, achieving over a 19.6% improvement in FID and a 23.4% improvement in FD_DINOv2 compared to standard fine-tuning. Notably, existing fine-tuned models can seamlessly integrate Domain Guidance to leverage these benefits, without additional training. Experimentally, we evaluate DoG across seven well-established transfer learning benchmarks, providing quantitative and qualitative evidence to substantiate its efficacy. Our comprehensive ablation study further underscores its superiority in the transfer of pre-trained diffusion models.
Researcher Affiliation Academia Jincheng Zhong, Xiangcheng Zhang, Jianmin Wang, Mingsheng Long School of Software, BNRist, Tsinghua University, China EMAIL, EMAIL
Pseudocode No The paper includes mathematical formulations (Equation 1 to 10) in sections like 'Diffusion formulation' and 'Classifier-free guidance' and 'Domain Guidance', but it does not contain explicitly labeled pseudocode blocks or algorithms.
Open Source Code Yes Code is available at this repository: https://github.com/thuml/Domain-Guidance.
Open Datasets Yes Our benchmark setups include 7 fine-grained downstream datasets: Food101 (Bossard et al., 2014), SUN397 (Xiao et al., 2010), DF20-Mini (Picek et al., 2022), Caltech101 (Griffin et al., 2007), CUB-200-2011 (Wah et al., 2011), ArtBench-10 (Liao et al., 2022), and Stanford Cars (Krause et al., 2013).
Dataset Splits Yes We fine-tune our domain model on a random partition of the whole dataset with 76,128 training images, 10,875 validation images and 21,750 test images. ... In the Stanford Cars dataset, there are 16,185 images displaying 196 distinct classes of cars. These images are divided into a training and a testing set: 8,144 images for training and 8,041 images for testing. ... ArtBench-10 ... It contains 5,000 training images and 1,000 testing images per style.
Hardware Specification Yes Each fine-tuning task is executed on a single NVIDIA A100 40GB GPU over approximately 6 hours.
Software Dependencies No All of our experiments are implemented using PyTorch and conducted on NVIDIA A100 40G GPUs. However, no specific version numbers for PyTorch or other software dependencies are provided.
Experiment Setup Yes We perform fine-tuning for 24,000 steps with a batch size of 32 at 256×256 resolution for all benchmarks. The standard fine-tuned models are trained in a CFG style, with a label dropout ratio of 10%. ... we generate 10,000 images with 50 sampling steps per benchmark, setting the guidance weights for both CFG and DoG to 1.5. ... Table 5: Hyperparameters of domain transfer experiments -- Backbone DiT-XL/2, Image Size 256, Batch Size 32, Learning Rate 1e-4, Optimizer Adam, Training Steps 24,000, Validation Interval 24,000, Sampling Steps 50.
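The guidance weight of 1.5 quoted above enters sampling through a CFG-style extrapolation. As a rough illustration only: Domain Guidance, per the paper's description, substitutes the pre-trained model's prediction for the unconditional branch of classifier-free guidance, and a minimal sketch of that combination rule looks like the following. All names here (`domain_guidance`, `eps_pre_uncond`, `eps_ft_cond`) are hypothetical stand-ins for the two models' noise predictions, not identifiers from the authors' released code.

```python
import numpy as np

def domain_guidance(eps_pre_uncond, eps_ft_cond, w=1.5):
    """CFG-style guided noise prediction (sketch, assumed form).

    eps_pre_uncond : noise prediction of the frozen pre-trained model
                     (plays the unconditional role in standard CFG)
    eps_ft_cond    : conditional noise prediction of the fine-tuned model
    w              : guidance weight (the report quotes w = 1.5)
    """
    return eps_pre_uncond + w * (eps_ft_cond - eps_pre_uncond)
```

With w = 1 the rule reduces to the fine-tuned model's conditional prediction, which is consistent with the observation that already fine-tuned models can adopt the guidance at sampling time without additional training.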