Detecting Backdoor Samples in Contrastive Language Image Pretraining
Authors: Hanxun Huang, Sarah Erfani, Yige Li, Xingjun Ma, James Bailey
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments also reveal that an unintentional backdoor already exists in the original CC3M dataset and has been trained into a popular open-source model released by OpenCLIP. Based on our detector, one can clean up a million-scale web dataset (e.g., CC3M) efficiently within 15 minutes using 4 Nvidia A100 GPUs. The code is publicly available in our GitHub repository. |
| Researcher Affiliation | Academia | 1School of Computing and Information Systems, The University of Melbourne, Australia; 2School of Computing and Information Systems, Singapore Management University, Singapore; 3School of Computer Science, Fudan University, China. Emails: {yigeli}@smu.edu.sg; {xingjunma}@fudan.edu.cn. |
| Pseudocode | Yes | We present the pseudo-code in Appendix A. Algorithm 1 Backdoor Sample Detection With Local Outlier Methods |
| Open Source Code | Yes | The code is publicly available in our GitHub repository. ... Open-source code is available here*. |
| Open Datasets | Yes | We conduct our experiment on the CC3M dataset (Sharma et al., 2018). The evaluation is conducted on ImageNet (Deng et al., 2009) with the zero-shot classifier (Radford et al., 2021). ... We provide evaluations with a larger dataset, the CC12M (Changpinyo et al., 2021), in Appendix B.7, which shows a consistent performance for local outlier methods. |
| Dataset Splits | Yes | Here, we experiment to remove 10% of the data from a poisoned CC3M dataset according to the detection score using DAO. ... In Figure 3a, we present the sensitivity of the defense performance to different filtering percentages. For the patch trigger, removing 1% is sufficient to mitigate the backdoor threat. Additional results for other triggers are in Appendix B.3. They show that removing 5% is sufficient to mitigate the backdoor threat of different kinds of triggers. |
| Hardware Specification | Yes | Based on our detector, one can clean up a million-scale web dataset (e.g., CC3M) efficiently within 15 minutes using 4 Nvidia A100 GPUs. ... We conducted our experiments on Nvidia A100 GPUs with PyTorch implementation. |
| Software Dependencies | No | We conducted our experiments on Nvidia A100 GPUs with PyTorch implementation. ... We use AdamW optimizer (Loshchilov & Hutter, 2019), weight decay is set to 0.2, batch size of 1024 and train for 30 epochs. We use ResNet-50 (He et al., 2016) and ViT-B-16 (Dosovitskiy et al., 2021) for the image encoder and transformer (Vaswani et al., 2017) for the text encoder. ... For Adam (Kingma & Ba, 2014) as the optimizer for m and , the learning rate is set to 0.05, β1 and β2 are set to 0.1. |
| Experiment Setup | Yes | We use a learning rate of 0.001, with AdamW optimizer (Loshchilov & Hutter, 2019), weight decay is set to 0.2, batch size of 1024 and train for 30 epochs. ... For CD, we find that in SSL, backdoor data has a higher L1 norm of the mask instead of lower as in a supervised setting. As a result, we use a higher L1 norm of the mask to indicate that a sample is more likely to be backdoor data. We use 100 optimization steps, α set to 0.001 and β set to 100 for CD. For ABL, we use the average loss value over the first 10 epochs for each sample. For iForest (Liu et al., 2008), we set the number of trees in the ensemble to 100. For LID, k is set to 16. For k-dist, SLOF and DAO, k is set to 16. We use a batch size of 2048 for running the detection algorithm. |
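The excerpts above repeatedly mention local outlier methods (k-dist, SLOF, DAO with k = 16) run over learned embeddings. As a rough illustration of the simplest of these, here is a minimal, brute-force sketch of the k-distance score (distance to a point's k-th nearest neighbor) on toy 2-D "embeddings". This is my own simplified reconstruction, not the paper's GPU implementation: the function names `k_distance_scores` and `flag_top_fraction` are hypothetical, the real pipeline operates on million-scale CLIP embeddings with batched GPU distance computation, and the detectors in the paper (SLOF, DAO) further normalize each point's score by its neighbors' local densities.

```python
import math

def k_distance_scores(embeddings, k=3):
    """Brute-force k-distance: for each point, the distance to its
    k-th nearest neighbor. Points in an unusually dense region get
    small scores; isolated points get large scores. O(n^2) — for
    illustration only (the paper uses k = 16 on CLIP embeddings)."""
    scores = []
    for i, a in enumerate(embeddings):
        dists = sorted(
            math.dist(a, b) for j, b in enumerate(embeddings) if j != i
        )
        scores.append(dists[k - 1])
    return scores

def flag_top_fraction(scores, fraction=0.1, largest=True):
    """Indices of the `fraction` of samples with the most extreme
    scores. The direction (`largest`) depends on the detector: per
    the excerpts, SSL backdoors can invert the usual ordering, so it
    is left as an explicit parameter here."""
    n = max(1, int(len(scores) * fraction))
    order = sorted(range(len(scores)),
                   key=lambda i: scores[i], reverse=largest)
    return set(order[:n])
```

With a tight cluster of near-duplicate points plus a few scattered ones, the clustered points receive markedly smaller k-distance scores, which is the kind of local-density signal the paper's detectors exploit; filtering the flagged fraction (1–10% in the excerpts) then removes the suspected backdoor samples before training.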