Training-Free Image Manipulation Localization Using Diffusion Models
Authors: Zhenfei Zhang, Ming-Ching Chang, Xin Li
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments were conducted using sixteen state-of-the-art (SoTA) methods across six IML datasets. The results demonstrate that our training-free method outperforms SoTA unsupervised and weakly-supervised techniques. |
| Researcher Affiliation | Academia | Zhenfei Zhang, Ming-Ching Chang, Xin Li Department of Computer Science, University at Albany, State University of New York, New York, USA, 12222 EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes the method using mathematical formulas and text, and provides an overview figure (Fig. 2). No explicitly labeled pseudocode or algorithm blocks are present. |
| Open Source Code | No | The paper mentions that the 'selected methods have open-source code' and that the authors used 'their open-source code' for comparison, referring to the sixteen state-of-the-art methods. However, there is no explicit statement or link indicating that the authors' own method's code is open-source or available. |
| Open Datasets | Yes | We use six IML datasets for evaluation: CASIAv1 (Dong, Wang, and Tan 2013), Columbia (Hsu and Chang 2006), Coverage (Wen et al. 2016), NIST16 (Guan et al. 2019), CIMD (Zhang, Li, and Chang 2024) and MagicBrush (Zhang et al. 2024a). |
| Dataset Splits | No | The paper states, 'We use six IML datasets for evaluation,' and mentions using 'the uncompressed subset' for the CIMD dataset. It also notes that 'For methods trained on a dataset size of 12.5K, CASIAv2 (Dong, Wang, and Tan 2013) was used as the training set, while other methods used their own synthetic datasets.' However, specific training, validation, or test split percentages or sample counts for the evaluation of their proposed method are not provided. |
| Hardware Specification | Yes | The method is implemented using PyTorch (Paszke et al. 2019) on an A40 GPU. |
| Software Dependencies | No | The method is implemented using PyTorch (Paszke et al. 2019). However, specific version numbers for PyTorch or any other software libraries are not provided. |
| Experiment Setup | Yes | For our proposed components, we set the initial similarity scale to s_d = 10^4, and the threshold for selecting the appropriate T is set to 0.2. In self-attention guidance, the guidance scale s_f is set to 1.3, the attention threshold τ is 1.3, and the blur sigma is 3. |
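For readers attempting to reproduce the setup, the hyperparameters reported in the Experiment Setup row can be gathered into a single configuration. This is a minimal sketch; the key names below are illustrative and do not come from the paper or its (unreleased) code.

```python
# Hypothetical reproduction config for the reported hyperparameters.
# Key names are illustrative; only the numeric values are from the paper.
config = {
    "initial_similarity_scale": 1e4,  # s_d, initial similarity scale
    "timestep_threshold": 0.2,        # threshold for selecting the timestep T
    "sag_guidance_scale": 1.3,        # s_f, self-attention guidance scale
    "sag_attention_threshold": 1.3,   # tau, attention-map threshold
    "sag_blur_sigma": 3,              # Gaussian blur sigma in SAG
}

for name, value in sorted(config.items()):
    print(f"{name} = {value}")
```

Note that hardware (A40 GPU) and the PyTorch version would also need to be pinned for a faithful reproduction, but the paper does not report library versions.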