Enabling Users to Falsify Deepfake Attacks
Authors: Tal Reiss, Bar Cavia, Yedid Hoshen
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments, we demonstrate that FACTOR significantly outperforms state-of-the-art deepfake detection techniques despite being simple to implement and not relying on any fake data for pretraining. Our code is available at https://github.com/talreiss/FACTOR. We conducted experiments on three face swapping datasets that provide identity-related information: Celeb-DF (Li et al., 2020), DFD (Research et al.) (which is part of FF++ (Rössler et al., 2019)), and DFDC (Dolhansky et al., 2019). The results, presented in Tab. 1, show that while the supervised baselines performed poorly on previously unseen attack scenarios, our method achieved near-perfect accuracy. |
| Researcher Affiliation | Academia | Tal Reiss, School of Computer Science and Engineering, The Hebrew University of Jerusalem, Israel; Bar Cavia, School of Computer Science and Engineering, The Hebrew University of Jerusalem, Israel; Yedid Hoshen, School of Computer Science and Engineering, The Hebrew University of Jerusalem, Israel |
| Pseudocode | No | The paper only describes steps in regular paragraph text without structured formatting. It provides mathematical formulas (Eq. 1, 2, 3) and descriptive text for the FACTOR method, but no explicit pseudocode or algorithm block. |
| Open Source Code | Yes | Our code is available at https://github.com/talreiss/FACTOR. |
| Open Datasets | Yes | We conducted experiments on three face swapping datasets that provide identity-related information: Celeb-DF (Li et al., 2020), DFD (Research et al.) (which is part of FF++ (Rössler et al., 2019)), and DFDC (Dolhansky et al., 2019). We evaluated our method on the Fake AVCeleb video forensics dataset (Khalid et al., 2021). Our evaluation is performed on a random sample of 1000 images from COCO (Lin et al., 2014). |
| Dataset Splits | Yes | To ensure uniformity in our experiments, we uniformly subsampled each video (train or test and for all datasets) into 32 frames. Furthermore, we split each identity's authentic videos into a 50/50 train-test split. |
| Hardware Specification | Yes | For the audio-visual verification component, processing a whole video clip takes approximately 3 seconds on a single NVIDIA RTX 2080 GPU. |
| Software Dependencies | Yes | Specifically, for CLIP's architecture, we leveraged the ViT-B/16 pretrained on the LAION-2B dataset, following OPENCLIP specifications (Ilharco et al., 2021). Furthermore, the checkpoint version of Stable Diffusion we used was v1-5 (Stable Diffusion.). |
| Experiment Setup | Yes | To ensure uniformity in our experiments, we uniformly subsampled each video (train or test and for all datasets) into 32 frames. Furthermore, we split each identity's authentic videos into a 50/50 train-test split. We opt for a simple but effective solution, using the truth score with the λ% lowest value in the video (we choose λ = 3%). Specifically, for CLIP's architecture, we leveraged the ViT-B/16 pretrained on the LAION-2B dataset, following OPENCLIP specifications (Ilharco et al., 2021). |
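The aggregation rule quoted in the Experiment Setup row (take the truth score with the λ% lowest value over a video's frames, λ = 3%) can be sketched as below. This is an illustrative reading of the quoted sentence, not the authors' released code; the function name and interface are assumptions.

```python
import numpy as np

def video_truth_score(frame_scores, lam=0.03):
    """Aggregate per-frame truth scores into a single video-level score.

    Per the quoted setup, the video score is the frame score at the
    lambda-fraction lowest position (lambda = 0.03, i.e. 3%).
    With the 32 uniformly subsampled frames mentioned in the table,
    this reduces to the minimum frame score.
    """
    scores = np.sort(np.asarray(frame_scores, dtype=float))
    # Index of the lambda% lowest value (at least the first element).
    idx = max(0, int(np.ceil(lam * len(scores))) - 1)
    return scores[idx]
```

Under this reading, a single low-scoring frame (one inconsistent with the claimed identity) drives the whole video's score down, which matches the paper's goal of flagging videos that fail the claimed fact anywhere.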