Learnable Frequency Decomposition for Image Forgery Detection and Localization

Authors: Dong Li, Jiaying Zhu, Yidi Liu, Xin Lu, Xueyang Fu, Jiawei Liu, Aiping Liu, Zheng-Jun Zha

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on multiple datasets demonstrate that our method outperforms state-of-the-art image forgery detection and localization techniques both qualitatively and quantitatively. We conduct extensive experiments on multiple benchmarks and demonstrate that our method outperforms state-of-the-art methods both qualitatively and quantitatively.
Researcher Affiliation | Academia | University of Science and Technology of China
Pseudocode | No | The paper describes the methodology using narrative text, equations, and diagrams, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | No | The paper does not contain any explicit statement about releasing the code or a link to a code repository.
Open Datasets | Yes | Testing Datasets: Following [Liu et al., 2022; Wang et al., 2022a], we evaluate our model on CASIA [Dong et al., 2013], Coverage [Wen et al., 2016], Columbia [Hsu and Chang, 2006], NIST16 [Guan et al., 2019] and IMD20 [Novozamsky et al., 2020].
Dataset Splits | Yes | We apply the same training/testing splits as [Hu et al., 2020; Wang et al., 2022a] to fine-tune our model for fair comparisons.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory amounts used for experiments.
Software Dependencies | No | The paper mentions using FFT and ResNet-50 pretrained on ImageNet but does not specify version numbers for any software libraries (e.g., Python, PyTorch, TensorFlow, CUDA).
Experiment Setup | Yes | In practice, α is set as 0.60 and β is set as 0.2. We use ResNet-50 pretrained on ImageNet [Deng et al., 2009] as the backbone network of the spectral decomposition subnetwork. Pre-training Data: We create a sizable image tampering dataset and use it to pre-train our model. This dataset includes three categories: 1) splicing, 2) copy-move, and 3) removal.
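
For anyone attempting a reproduction, the settings quoted in the table can be collected into one record. The following is a minimal sketch, not the authors' configuration format: the class name `ReportedSetup`, its field names, and the helper `unreported_items` are hypothetical. The excerpt does not say what α and β control, so they are stored as plain values without interpretation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ReportedSetup:
    """Values quoted in the summary table; all field names are hypothetical."""

    # Reported hyperparameters. Their role is not stated in this summary.
    alpha: float = 0.60
    beta: float = 0.2
    # Backbone of the spectral decomposition subnetwork, as quoted.
    backbone: str = "ResNet-50"
    backbone_pretraining: str = "ImageNet"
    # Tampering categories in the pre-training dataset.
    pretrain_categories: Tuple[str, ...] = ("splicing", "copy-move", "removal")
    # Benchmarks listed for evaluation.
    test_datasets: Tuple[str, ...] = (
        "CASIA", "Coverage", "Columbia", "NIST16", "IMD20",
    )


def unreported_items() -> Tuple[str, ...]:
    """Rows the summary marks 'No': details a reproduction must choose itself."""
    return (
        "pseudocode",
        "open source code",
        "hardware specification",
        "software dependencies",
    )


setup = ReportedSetup()
print(setup)
print("Not reported:", ", ".join(unreported_items()))
```

Freezing the dataclass keeps the reported values from being changed by accident, which makes it easier to separate what the paper states from what a reproduction had to guess.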