Ensembling Diffusion Models via Adaptive Feature Aggregation

Authors: Cong Wang, Kuan Tian, Yonghang Guan, Fei Shen, Zhiwei Jiang, Qing Gu, Jun Zhang

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct both quantitative and qualitative experiments, demonstrating that our AFA outperforms the base models and the baseline methods in both superior quality and context alignment. ... (Section 4: Experiments)
Researcher Affiliation | Collaboration | 1 State Key Laboratory for Novel Software Technology, Nanjing University; 2 Tencent AIPD
Pseudocode | Yes | The forward process of AFA can be seen in Alg. 1. (Appendix A, Algorithm 1: One forward process of Adaptive Feature Aggregation (AFA))
Open Source Code | Yes | The code is available at https://github.com/tenvence/afa.
Open Datasets | Yes | We evaluate the models with two datasets, which are COCO 2017 (Lin et al., 2014) and DrawBench prompts (Saharia et al., 2022), respectively. ... Our AFA framework is trained on 10,000 samples from the dataset JourneyDB (Pan et al., 2023) for 10 epochs with batch size 8. ... To validate the generality of our AFA, we evaluate it on additional datasets, which are DiffusionDB (Wang et al., 2022), JourneyDB (Pan et al., 2023), and LAION-COCO, respectively.
Dataset Splits | Yes | COCO 2017 comprises 118,287 and 5,000 image-caption pairs in the test and validation sets. All the models generate images with a resolution of 256×256. We apply four metrics to evaluate the generation performance, which are Fréchet Inception Distance (FID), Inception Score (IS), CLIP-I, and CLIP-T, respectively. FID and IS are applied to the test set, while CLIP-I and CLIP-T are applied to the validation set.
Hardware Specification | Yes | We train our AFA in an environment equipped with 8 NVIDIA V100 GPUs, each with 32GB of memory.
Software Dependencies | No | The paper mentions using AdamW as the optimizer but does not specify software dependencies like Python, PyTorch, or CUDA with version numbers.
Experiment Setup | Yes | AdamW (Loshchilov & Hutter, 2017) is used as the optimizer with a learning rate of 0.0001 and a weight decay of 0.01. ... Our AFA framework is trained on 10,000 samples from the dataset JourneyDB (Pan et al., 2023) for 10 epochs with batch size 8. To enable CFG, we use a probability of 0.1 to drop textual prompts. ... The CFG weight βCFG is set to 7.5. We evaluate the models with two datasets ... For a fair comparison, all the methods generate 4 images by DDIM (Ho et al., 2020) for 50 inference steps.
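The Pseudocode row points to the paper's Algorithm 1 (one forward pass of Adaptive Feature Aggregation), which is not reproduced here. As a hedged illustration only: the core idea of aggregating per-model features with data-dependent weights can be sketched as a softmax-weighted sum. The function names, the placeholder logits, and the plain softmax gating below are assumptions for illustration, not the paper's actual learned aggregation module.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the model axis.
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def aggregate_features(features, logits):
    """Weighted sum of per-model feature maps.

    In AFA the weights come from a learned, input-dependent module;
    here they are a plain softmax of placeholder logits, so this is
    only a structural sketch of feature-level ensembling.
    """
    weights = softmax(np.asarray(logits, dtype=float))
    return sum(w * f for w, f in zip(weights, features))
```

With equal logits the two base models' features are simply averaged, which is the degenerate (non-adaptive) case the learned weights improve on.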
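The Dataset Splits row lists CLIP-I and CLIP-T among the metrics. Both reduce to cosine similarity between CLIP embeddings (image-image for CLIP-I, image-text for CLIP-T). The sketch below shows only that similarity step on placeholder vectors; the embedding vectors and the function name are assumptions, since real evaluation runs the CLIP encoders first.

```python
import numpy as np

def clip_alignment(emb_a, emb_b):
    # Cosine similarity between two L2-normalized embeddings.
    # Real CLIP-I/CLIP-T feed CLIP encoder outputs here; any two
    # same-length vectors stand in for this sketch.
    a = emb_a / np.linalg.norm(emb_a)
    b = emb_b / np.linalg.norm(emb_b)
    return float(a @ b)
```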