Simplifying Control Mechanism in Text-to-Image Diffusion Models

Authors: Zhida Feng, Li Chen, Yuenan Sun, Jiaxiang Liu, Shikun Feng

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our extensive experiments confirm that Simple-ControlNet matches or surpasses ControlNet's performance across a broad range of tasks and base diffusion models, showcasing its utility and efficiency.
Researcher Affiliation | Collaboration | Zhida Feng (1,2,3), Li Chen (1,2,*), Yuenan Sun (1,2), Jiaxiang Liu (3), Shikun Feng (3). 1: School of Computer Science and Technology, Wuhan University of Science and Technology; 2: Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan University of Science and Technology; 3: Baidu Inc.
Pseudocode | No | The paper describes the architecture and mathematical formulations but does not contain a distinct pseudocode block or algorithm section.
Open Source Code | Yes | Code: https://github.com/feng-zhida/Simple-ControlNet
Open Datasets | Yes | We sampled 2 million text-image pairs from the COYO-700M (Byeon et al. 2022) dataset for training. Our evaluation set comprised 10,000 image-text pairs sampled from the COCO (Lin et al. 2014) val2014 dataset.
Dataset Splits | Yes | We sampled 2 million text-image pairs from the COYO-700M (Byeon et al. 2022) dataset for training. All models were trained for 40,000 iterations with a batch size of 128 using the AdamW (Loshchilov and Hutter 2019) optimizer, with settings β1 = 0.9 and β2 = 0.999. Our evaluation set comprised 10,000 image-text pairs sampled from the COCO (Lin et al. 2014) val2014 dataset.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types) used for running the experiments.
Software Dependencies | No | The paper mentions several tools and models, such as the AdamW optimizer, DPM-Solver, and Stable Diffusion v1-5, but it does not specify version numbers for ancillary software dependencies such as programming languages or libraries (e.g., Python, PyTorch).
Experiment Setup | Yes | All models were trained for 40,000 iterations with a batch size of 128 using the AdamW (Loshchilov and Hutter 2019) optimizer, with settings β1 = 0.9 and β2 = 0.999. We applied LoRA (Low-Rank Adaptation) (Hu et al. 2021) across all self-attention layers, employing a rank of 8, and set the LoRA dropout to 0.1. All models, including ours, use the DPM-Solver (Lu et al. 2022) configured for 25 steps with a control strength of 1.0. When using CFG, we set the guidance scale to 7.5 for all models, which is the default setting in Stable Diffusion.
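Since the paper provides no pseudocode, the hyperparameters quoted in the rows above can be gathered into a single configuration sketch for reference. This is a minimal sketch; the key names and structure are illustrative assumptions, not taken from the authors' repository.

```python
# Hedged sketch: reported hyperparameters collected into one config dict.
# Key names are illustrative; only the values come from the paper's text.
train_config = {
    "train_pairs": 2_000_000,   # sampled from COYO-700M
    "eval_pairs": 10_000,       # sampled from COCO val2014
    "iterations": 40_000,
    "batch_size": 128,
    "optimizer": {"name": "AdamW", "betas": (0.9, 0.999)},
    "lora": {"rank": 8, "dropout": 0.1,
             "applied_to": "all self-attention layers"},
    "sampler": {"name": "DPM-Solver", "steps": 25, "control_strength": 1.0},
    "cfg_guidance_scale": 7.5,  # Stable Diffusion default
}

# Total training samples seen = iterations x batch size,
# i.e. roughly 2.56 passes over the 2M training pairs.
samples_seen = train_config["iterations"] * train_config["batch_size"]
print(samples_seen)  # 5120000
```

This makes the training budget explicit: 40,000 iterations at batch size 128 amount to about 5.12M samples, or roughly 2.56 epochs over the 2M-pair training set.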