Ensembling Diffusion Models via Adaptive Feature Aggregation

Authors: Cong Wang, Kuan Tian, Yonghang Guan, Fei Shen, Zhiwei Jiang, Qing Gu, Jun Zhang

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct both quantitative and qualitative experiments, demonstrating that our AFA outperforms the base models and the baseline methods in both superior quality and context alignment. ... (Section 4: Experiments)
Researcher Affiliation | Collaboration | 1 State Key Laboratory for Novel Software Technology, Nanjing University; 2 Tencent AIPD
Pseudocode | Yes | The forward process of AFA can be seen in Alg. 1. (Appendix A, Algorithm 1: One forward process of Adaptive Feature Aggregation (AFA))
Open Source Code | Yes | The code is available at https://github.com/tenvence/afa.
Open Datasets | Yes | We evaluate the models with two datasets, which are COCO 2017 (Lin et al., 2014) and DrawBench prompts (Saharia et al., 2022), respectively. ... Our AFA framework is trained on 10,000 samples from the dataset JourneyDB (Pan et al., 2023) for 10 epochs with batch size 8. ... To validate the generality of our AFA, we evaluate it on additional datasets, which are DiffusionDB (Wang et al., 2022), JourneyDB (Pan et al., 2023), and LAION-COCO, respectively.
Dataset Splits | Yes | COCO 2017 comprises 118,287 and 5,000 image-caption pairs in the test and validation sets. All the models generate images with a resolution of 256×256. We apply four metrics to evaluate the generation performance, which are Fréchet Inception Distance (FID), Inception Score (IS), CLIP-I, and CLIP-T, respectively. FID and IS are applied to the test set, while CLIP-I and CLIP-T are applied to the validation set.
Hardware Specification | Yes | We train our AFA in an environment equipped with 8 NVIDIA V100 GPUs, each with 32GB of memory.
Software Dependencies | No | The paper mentions using AdamW as the optimizer but does not specify software dependencies like Python, PyTorch, or CUDA with version numbers.
Experiment Setup | Yes | AdamW (Loshchilov & Hutter, 2017) is used as the optimizer with a learning rate of 0.0001 and a weight decay of 0.01. ... Our AFA framework is trained on 10,000 samples from the dataset JourneyDB (Pan et al., 2023) for 10 epochs with batch size 8. To enable CFG, we use a probability of 0.1 to drop textual prompts. ... The CFG weight βCFG is set to 7.5. We evaluate the models with two datasets ... For a fair comparison, all the methods generate 4 images by DDIM (Ho et al., 2020) for 50 inference steps.
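The Pseudocode row points to the paper's Algorithm 1 (one forward pass of Adaptive Feature Aggregation), which is not reproduced here. As a hedged illustration only: the core idea of aggregating per-model features with data-dependent weights can be sketched as a softmax-weighted sum. The function names, the placeholder logits, and the plain softmax gating below are assumptions for illustration, not the paper's actual learned aggregation module.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the model axis.
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def aggregate_features(features, logits):
    """Weighted sum of per-model feature maps.

    In AFA the weights come from a learned, input-dependent module;
    here they are a plain softmax of placeholder logits, so this is
    only a structural sketch of feature-level ensembling.
    """
    weights = softmax(np.asarray(logits, dtype=float))
    return sum(w * f for w, f in zip(weights, features))
```

With equal logits the two base models' features are simply averaged, which is the degenerate (non-adaptive) case the learned weights improve on.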
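The Dataset Splits row lists CLIP-I and CLIP-T among the metrics. Both reduce to cosine similarity between CLIP embeddings (image-image for CLIP-I, image-text for CLIP-T). The sketch below shows only that similarity step on placeholder vectors; the embedding vectors and the function name are assumptions, since real evaluation runs the CLIP encoders first.

```python
import numpy as np

def clip_alignment(emb_a, emb_b):
    # Cosine similarity between two L2-normalized embeddings.
    # Real CLIP-I/CLIP-T feed CLIP encoder outputs here; any two
    # same-length vectors stand in for this sketch.
    a = emb_a / np.linalg.norm(emb_a)
    b = emb_b / np.linalg.norm(emb_b)
    return float(a @ b)
```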