Test-Time Multimodal Backdoor Detection by Contrastive Prompting
Authors: Yuwei Niu, Shuo He, Qi Wei, Zongyu Wu, Feng Liu, Lei Feng
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments validate that our proposed BDetCLIP is superior to state-of-the-art backdoor detection methods, in terms of both effectiveness and efficiency. (Abstract) and the entire Section 4 Experiments. |
| Researcher Affiliation | Academia | 1Chongqing University, 2Nanyang Technological University, 3Penn State University, 4University of Melbourne, 5Southeast University. Correspondence to: Lei Feng <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 BDetCLIP |
| Open Source Code | No | The official open-sourced codes for STRIP (Gao et al., 2019) can be found at: https://github.com/garrisongys/STRIP. ...The official open-sourced codes for SCALE-UP (Guo et al., 2023) can be found at: https://github.com/JunfengGo/SCALE-UP. ...The official open-sourced codes for TeCo (Liu et al., 2023) can be found at: https://github.com/CGCL-codes/TeCo. (These are for comparison methods, not the paper's own work.) The paper does not explicitly state that its own code for BDetCLIP is released or provide a link. |
| Open Datasets | Yes | In the experiment, we evaluate BDetCLIP on various downstream classification datasets including ImageNet-1K (Russakovsky et al., 2015), Food-101 (Bossard et al., 2014) and Caltech-101 (Fei-Fei et al., 2004). ... Besides, we select target backdoored samples from CC3M (Sharma et al., 2018), which is a popular multimodal pre-training dataset. |
| Dataset Splits | Yes | In our experiment, we utilized the validation set of ImageNet-1K (Russakovsky et al., 2015), along with the test sets of Food-101 (Bossard et al., 2014) and Caltech-101 (Fei-Fei et al., 2004). By using a fixed backdoor ratio (0.3) on different downstream datasets in the evaluation, there are 15,000 (out of 50,000) backdoored images on ImageNet-1K, 7,575 (out of 25,250) backdoored images on Food-101, and 740 (out of 2,465) backdoored images on Caltech-101. |
| Hardware Specification | Yes | All experiments are conducted on 8 NVIDIA 3090 GPUs. |
| Software Dependencies | Yes | Specifically, we first prompt GPT-4 (Achiam et al., 2023) to generate class-related (or class-perturbed random) description texts... Also, using open-source models (e.g., LLaMA3-8B (Dubey et al., 2024) and Mistral-7B-Instruct-v0.2 (Jiang et al., 2023)) as alternatives. |
| Experiment Setup | Yes | We finetune the pretrained model for 5 epochs with an initial learning rate of 1e-6 with cosine scheduling and 50 warmup steps and use AdamW as the optimizer. ... We trained for 64 epochs with a batch size of 128, an initial learning rate of 0.0005 for cosine scheduling, and 10000 warm-up steps for the AdamW optimizer. |
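The fine-tuning recipe quoted above (AdamW with linear warmup followed by cosine decay) follows a common warmup-then-cosine pattern. As a reproduction aid, here is a minimal pure-Python sketch of the resulting learning-rate curve; the total step count is assumed for illustration, since the paper reports epochs rather than steps, and the exact schedule implementation the authors used is not specified.

```python
import math

def lr_at_step(step, total_steps, base_lr=1e-6, warmup_steps=50):
    """Linear warmup to base_lr over warmup_steps, then cosine decay to 0.

    Mirrors the common warmup-plus-cosine schedule (e.g. Hugging Face's
    get_cosine_schedule_with_warmup); the paper's own code may differ.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Illustrative: 5 epochs at ~1000 optimizer steps per epoch (assumed)
total = 5 * 1000
print(lr_at_step(0, total))      # start of warmup: 0.0
print(lr_at_step(50, total))     # peak after warmup: 1e-6
print(lr_at_step(total, total))  # end of cosine decay: ~0.0
```

This schedule peaks at the reported initial learning rate of 1e-6 immediately after the 50 warmup steps and decays to zero over the remaining steps; the second quoted setup (lr 0.0005, 10,000 warmup steps) would use the same curve with those parameters.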