Backdooring Vision-Language Models with Out-Of-Distribution Data
Authors: Weimin Lyu, Jiachen Yao, Saumya Gupta, Lu Pang, Tao Sun, Lingjie Yi, Lijie Hu, Haibin Ling, Chao Chen
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our evaluation on image captioning and visual question answering (VQA) tasks confirms the effectiveness of VLOOD, revealing a critical security vulnerability in VLMs and laying the foundation for future research on securing multimodal models against sophisticated threats. Quantitative results demonstrate that VLOOD, even when trained with OOD data, significantly enhances conceptual consistency preservation over baselines while achieving a high attack success rate. |
| Researcher Affiliation | Academia | Weimin Lyu1, Jiachen Yao1, Saumya Gupta1, Lu Pang1, Tao Sun1, Lingjie Yi1, Lijie Hu2, Haibin Ling1, Chao Chen1. 1 Stony Brook University, 2 King Abdullah University of Science and Technology |
| Pseudocode | No | The paper describes the methodology using text and mathematical equations, but it does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The codebase is built upon Backdoor Bench (https://github.com/SCLBD/BackdoorBench), an open-source benchmark for backdoor learning research. This refers to a third-party codebase used by the authors, not their own source code for the methodology described in the paper. |
| Open Datasets | Yes | We evaluate the image captioning task on the Flickr8k (Hodosh et al., 2013), Flickr30k (Young et al., 2014), and COCO (Lin et al., 2014) datasets, and the VQA task on the OK-VQA (Marino et al., 2019) and VQAv2 (Goyal et al., 2017) datasets. |
| Dataset Splits | Yes | We utilize only 3000 samples of clean data D = {(I, T, O)}, and generate another 3000 poisoned data samples D̃ = {(Ĩ, T, Õ)} from D. The 3000 image-text pairs are randomly selected from the aforementioned datasets. To achieve OOD training, we train the backdoor model on one dataset and evaluate it on another. Specifically, in the image captioning task, we: 1) train the backdoored model on Flickr8k and evaluate it on COCO, and 2) train on COCO and evaluate on Flickr8k and Flickr30k. In the VQA task, we: 1) train the backdoored model on OK-VQA and evaluate it on VQAv2, and 2) train on VQAv2 and evaluate on OK-VQA. |
| Hardware Specification | Yes | The backdoored model is trained on an A6000 GPU with 48 GB of memory. |
| Software Dependencies | No | The codebase is built upon Backdoor Bench (https://github.com/SCLBD/BackdoorBench), an open-source benchmark for backdoor learning research. This mentions a codebase but does not provide specific version numbers for key software components or libraries. |
| Experiment Setup | No | The paper mentions that 'during fine-tuning, only the Q-Former adaptor is trained, while the image encoder and LLM remain frozen' and describes the dynamic adjustment mechanism for 'λ'. It also provides 'Converge Epoch' values for different 'λ' initializations in Table 11. However, specific values for common hyperparameters like learning rate, batch size, optimizer type, or other detailed training configurations are not explicitly provided in the main text. |
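The partial setup the paper does disclose, training only the Q-Former adaptor while keeping the image encoder and LLM frozen, can be sketched in PyTorch. This is a minimal illustration with small stand-in `nn.Linear` modules (the real components are BLIP-2-scale networks), and the learning rate and optimizer choice here are assumptions, since the paper does not report them:

```python
import torch
import torch.nn as nn

# Stand-in modules: in the paper these are the frozen BLIP-2 image
# encoder and LLM, with only the Q-Former adaptor left trainable.
image_encoder = nn.Linear(32, 16)  # frozen
qformer = nn.Linear(16, 16)        # trainable adaptor
llm = nn.Linear(16, 8)             # frozen

# Freeze everything except the Q-Former adaptor.
for module in (image_encoder, llm):
    for p in module.parameters():
        p.requires_grad = False

# Optimizer sees only the adaptor's parameters (lr is a placeholder).
optimizer = torch.optim.AdamW(qformer.parameters(), lr=1e-4)

# One illustrative training step on random tensors.
x = torch.randn(4, 32)
target = torch.randn(4, 8)
loss = nn.functional.mse_loss(llm(qformer(image_encoder(x))), target)
loss.backward()
optimizer.step()

# Count how many frozen-module parameters would still be updated.
frozen_trainable = sum(
    p.requires_grad
    for p in list(image_encoder.parameters()) + list(llm.parameters())
)
print(frozen_trainable)
```

Gradients accumulate only in `qformer`; the frozen modules receive none, which mirrors the parameter-efficient fine-tuning regime the paper describes.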