Robust SAM: On the Adversarial Robustness of Vision Foundation Models
Authors: Jiahuan Long, Zhengqin Xu, Tingsong Jiang, Wen Yao, Shuai Jia, Chao Ma, Xiaoqian Chen
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that our cross-prompt attack method outperforms previous approaches in terms of attack success rate on both SAM and SAM 2. By adapting only 512 parameters, we achieve at least a 15% improvement in mean intersection over union (mIoU) against various adversarial attacks. Compared to previous defense methods, our approach enhances the robustness of SAM while maximally maintaining its original performance. |
| Researcher Affiliation | Academia | 1MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, 2Defense Innovation Institute, Chinese Academy of Military Science, 3Intelligent Game and Decision Laboratory, 4Chinese Academy of Military Science, EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes the methodologies using text and mathematical formulations (e.g., equations for mIoU, L_adv, L_def) and figures, but does not include a distinct 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper does not explicitly state that source code for the described methodology is released or provide a link to a code repository. |
| Open Datasets | Yes | To evaluate the robustness of SAM under different types of prompts (i.e., point and box prompts), we randomly sample 2000 images from the SA1B (Kirillov et al. 2023), VOC (Everingham et al. 2010), COCO (Lin et al. 2014), and DAVIS (Pont-Tuset et al. 2017) datasets. |
| Dataset Splits | Yes | The VOC dataset is split into 70% for training and 30% for evaluation. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU types, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using the Adam optimizer (Kingma and Ba 2014) but does not specify version numbers for other key software components, libraries, or programming languages. |
| Experiment Setup | Yes | For the attack setting, we set the total number of iteration steps to 20, and the perturbation intensity ϵ to 16/255 for PPA, BPA, and our cross-prompt attack. The attack feature A is set to the negative values of the key features, and K in the TOPK function is set to 5. For all few-parameter adaptation methods, we randomly sample 70% of the adversarial examples in the VOC dataset for adapting SAM. Our training employs the Adam optimizer (Kingma and Ba 2014). The initial learning rate is set to 1.0 × 10⁻³, and the weight decay is 5 × 10⁻⁵ with one image per mini-batch. The number of training epochs is set to 500. |
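The evaluation metric quoted above, mean intersection over union (mIoU), is the average per-mask IoU over a set of prediction/ground-truth pairs. A minimal sketch of that computation on flat binary masks (the helper names `iou` and `mean_iou` are illustrative, not from the paper):

```python
def iou(pred, gt):
    """Intersection-over-Union between two binary masks.

    pred, gt: equal-length flat sequences of 0/1 mask values.
    Returns 1.0 when both masks are empty (a common convention).
    """
    inter = sum(p & g for p, g in zip(pred, gt))
    union = sum(p | g for p, g in zip(pred, gt))
    return inter / union if union else 1.0


def mean_iou(pairs):
    """mIoU: mean IoU over (prediction, ground-truth) mask pairs."""
    return sum(iou(p, g) for p, g in pairs) / len(pairs)


# Toy example: prediction overlaps ground truth on 1 of 2 foreground pixels.
print(iou([1, 1, 0, 0], [1, 0, 0, 0]))  # intersection=1, union=2 -> 0.5
```

With this metric, the paper's "at least a 15% improvement in mIoU against various adversarial attacks" means the adapted SAM's masks on adversarial inputs overlap the ground truth substantially more than the unadapted model's.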