Refine-by-Align: Reference-Guided Artifacts Refinement through Semantic Alignment

Authors: Yizhi Song, Liu He, Zhifei Zhang, Soo Ye Kim, He Zhang, Wei Xiong, Zhe Lin, Brian Price, Scott Cohen, Jianming Zhang, Daniel Aliaga

ICLR 2025

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Extensive experiments and comparisons demonstrate that our pipeline greatly pushes the boundary of fine details in the image synthesis models. Project page: https://song630.github.io/Refine-by-Align-Project-Page/" ... "Quantitative and qualitative comparisons (i.e., using several well-established metrics and a user study; Sec. 4.2, Sec. 4.3) show that in terms of detail and appearance preservation our model outperforms all six baseline models." |
| Researcher Affiliation | Collaboration | Yizhi Song¹, Liu He¹, Zhifei Zhang², Soo Ye Kim², He Zhang², Wei Xiong², Zhe Lin², Brian Price², Scott Cohen², Jianming Zhang², Daniel Aliaga¹ (¹Purdue University, ²Adobe Research) |
| Pseudocode | Yes | "Algorithm 1: Optimal Cross-Attention Alignment (refer to Fig. 4 for visualization)" |
| Open Source Code | No | Only a project page is provided: https://song630.github.io/Refine-by-Align-Project-Page/ |
| Open Datasets | Yes | "To provide insight on the appearance of generative artifacts and an effective evaluation of our artifacts refinement approach, we present Gen Artifact Bench, the first benchmark for reference-guided artifacts refinement (refer to the Appendix for examples), featuring: ..." "We use Pixabay (Song et al., 2023) with panoptic segmentation labels as the training dataset." "Our training data consists of Pixabay and MVObj, a manually annotated dataset where an object appears in multiple images with different contexts and views." |
| Dataset Splits | No | The paper mentions using Pixabay as a training dataset and Gen Artifact Bench for evaluation, but it does not specify training/validation/test split percentages, sample counts, or a methodology for data partitioning. |
| Hardware Specification | No | The paper does not report the hardware used for its experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions models and frameworks such as DINOv2, AnyDoor, IMPRINT, Stable Diffusion, and BLIP-2, but it does not give version numbers for any software libraries, programming languages, or other dependencies. |
| Experiment Setup | No | The paper states "Parameters. t = 0 and l = 9 are used in all the comparisons below." and notes that this combination was chosen by a grid search, but it does not report common training hyperparameters such as learning rate, batch size, number of epochs, or optimizer settings. |