ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models

Authors: Jiaxiang Cheng, Pan Xie, Xin Xia, Jiashi Li, Jie Wu, Yuxi Ren, Huixia Li, Xuefeng Xiao, Shilei Wen, Lean Fu

AAAI 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we introduce the experimental setup and results. First, we describe the experimental setup in detail, including training details, evaluation metrics, and the selection of personalized models, and we present the main experimental results, comparing ResAdapter with other multi-resolution image generation models as well as the original personalized models. Then we present extended experimental results: the application of ResAdapter in combination with other modules. Finally, we perform ablation experiments on the ResAdapter modules and alpha.
Researcher Affiliation Industry Jiaxiang Cheng, Pan Xie*, Xin Xia, Jiashi Li, Jie Wu, Yuxi Ren, Huixia Li, Xuefeng Xiao, Shilei Wen, Lean Fu, ByteDance Inc., Beijing, China EMAIL, EMAIL
Pseudocode No The paper describes methods and strategies but does not include any explicitly labeled pseudocode or algorithm blocks. The pipeline in Figure 2 is a diagram, not pseudocode.
Open Source Code Yes Code https://github.com/bytedance/res-adapter
Open Datasets Yes We train ResAdapter using the large-scale dataset LAION-5B (Schuhmann et al. 2022). For experiments comparing ResAdapter with the other multi-resolution image generation models, we follow (Haji-Ali, Balakrishnan, and Ordonez 2024) and use Fréchet Inception Distance (FID) (Heusel et al. 2017) and CLIP Score (Hessel et al. 2021) as evaluation metrics; they evaluate the quality of the generated images and the degree of alignment between the generated images and the prompts. For other multi-resolution generation models, we choose MultiDiffusion (MD) (Bar-Tal et al. 2023) and ElasticDiffusion (ED) as baselines. Personalized Models. To demonstrate the effectiveness of our ResAdapter, we choose multiple personalized models from Civitai (Civitai 2022), which cover a wide range of domains from animation to realistic photography.
Dataset Splits No The paper describes training on mixed datasets with various resolutions and aspect ratios, and using a probability function to sample images for training. It also mentions evaluating on LAION-COCO. However, it does not provide specific train/test/validation splits (percentages or counts) for either LAION-5B (used for training) or LAION-COCO (used for evaluation), nor does it reference standard predefined splits with citations for its own experimental setup.
Hardware Specification Yes Since ResAdapter has only 0.5M trainable parameters, we train it for less than an hour on 8 A100 GPUs.
Software Dependencies No The paper mentions using the AdamW optimizer and specific base models (SD1.5, SDXL1.0) but does not provide specific version numbers for software libraries, programming languages, or other ancillary software components used for implementation.
Experiment Setup Yes For SD1.5 and SDXL, we both use a batch size of 32 and a learning rate of 1e-4 for training. We use the AdamW optimizer (Kingma and Ba 2015) with β1 = 0.95, β2 = 0.99. The total number of training steps is 20,000.
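The reported hyperparameters above are sufficient to reproduce the optimizer configuration. A minimal sketch in PyTorch, assuming the trainable ResAdapter parameters are exposed as a standard `nn.Module` (the `adapter` module here is a placeholder, not the authors' code):

```python
import torch
import torch.nn as nn

# Placeholder standing in for the ~0.5M trainable ResAdapter parameters.
adapter = nn.Linear(64, 64)

# Optimizer settings as reported in the paper: AdamW with
# learning rate 1e-4, beta1 = 0.95, beta2 = 0.99.
optimizer = torch.optim.AdamW(
    adapter.parameters(),
    lr=1e-4,
    betas=(0.95, 0.99),
)

batch_size = 32      # reported batch size for SD1.5 and SDXL
total_steps = 20_000 # reported total training steps
```

The training loop itself (data sampling over mixed resolutions, the diffusion loss) is not specified in enough detail here to reconstruct; the snippet only fixes the optimizer state that the paper does report.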