Stable Segment Anything Model

Authors: Qi Fan, Xin Tao, Lei Ke, Mingqiao Ye, Di Zhang, Pengfei Wan, Yu-Wing Tai, Chi-Keung Tang

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments validate the effectiveness and advantages of our approach, underscoring Stable-SAM as a more robust solution for segmenting anything. Codes are at https://github.com/fanq15/Stable-SAM. (...) We evaluate the segmentation accuracy and stability of the ViT-Large based SAM with different prompt types and qualities, including box prompts with added noise (...) The evaluation utilizes four segmentation datasets as in HQ-SAM: DIS (...) (validation set), ThinObject-5K (...) (test set), COIFT (...), and HR-SOD (...). Table 1 tabulates that SAM's segmentation accuracy and stability significantly decrease with low-quality prompts (...). We perform detailed analysis on Stable-SAM on its network modules, model scalability, low-shot generalization, point prompt quality, backbone variants, relation to other methods, and stability visualization.
Researcher Affiliation | Collaboration | 1 Nanjing University, 2 Kuaishou Technology, 3 Carnegie Mellon University, 4 EPFL, 5 Dartmouth College, 6 The Hong Kong University of Science and Technology
Pseudocode | No | The paper describes methods verbally and uses mathematical formulations (e.g., Equations 1-6) but does not present any pseudocode or algorithm blocks.
Open Source Code | Yes | Codes are at https://github.com/fanq15/Stable-SAM.
Open Datasets | Yes | The evaluation utilizes four segmentation datasets as in HQ-SAM: DIS (Qin et al., 2022) (validation set), ThinObject-5K (Liew et al., 2021) (test set), COIFT (Mansilla & Miranda, 2019), and HR-SOD (Zeng et al., 2019). Furthermore, we validate the model's zero-shot generalization ability on three challenging segmentation benchmarks, including COCO (Lin et al., 2014), SGinW (Zou et al., 2023) and MESS (Blumenstiel et al., 2023).
Dataset Splits | Yes | The evaluation utilizes four segmentation datasets as in HQ-SAM: DIS (Qin et al., 2022) (validation set), ThinObject-5K (Liew et al., 2021) (test set), COIFT (Mansilla & Miranda, 2019), and HR-SOD (Zeng et al., 2019). (...) For every input image and prompt type, we randomly select 20 prompts to compute their segmentation stability (...). (...) we train all models on the HQSeg-44K dataset, and evaluate their performance on four fine-grained segmentation datasets (...). (...) All models are trained with RTS by 1 training epoch, with 220/440 train images.
Hardware Specification | No | The paper does not mention any specific hardware used for training or inference, such as GPU models, CPU models, or memory specifications, other than the general memory usage of the models (e.g., 7.6 GB).
Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., programming languages, libraries, or frameworks).
Experiment Setup | Yes | All our Stable-SAM models are trained by just one epoch for fast adaptation unless otherwise stated. All other models are trained 12 epochs. (...) To address inaccurate prompts, our RTS incorporates prompts of varying qualities during training. These prompts include ground-truth boxes, box prompts with added noise (noise scale 0.4), and point prompts with varying numbers of points (1, 3, 10 positive points randomly chosen from the ground-truth mask).
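The quoted protocol describes three mechanisms: noisy box prompts (noise scale 0.4), positive point prompts sampled from the ground-truth mask (1, 3, or 10 points), and a stability score computed over 20 random prompts per image. The paper does not spell out these implementations, so the sketch below is one plausible reading, not the authors' code: the noise model (uniform perturbation proportional to box side lengths) and the stability measure (mean pairwise IoU of the masks predicted from different prompts) are assumptions, and all function names are hypothetical.

```python
import numpy as np

def add_box_noise(box, noise_scale=0.4, rng=None):
    """Perturb a ground-truth box (x1, y1, x2, y2) with uniform noise
    proportional to its side lengths; 0.4 is the paper's stated scale."""
    rng = np.random.default_rng() if rng is None else rng
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    noise = rng.uniform(-noise_scale, noise_scale, size=4) * np.array([w, h, w, h])
    return np.asarray(box, dtype=float) + noise

def sample_point_prompts(gt_mask, n_points, rng=None):
    """Randomly pick n_points positive points (x, y) from the ground-truth
    mask; the paper uses 1, 3, or 10 points."""
    rng = np.random.default_rng() if rng is None else rng
    ys, xs = np.nonzero(gt_mask)
    idx = rng.choice(len(xs), size=n_points, replace=False)
    return np.stack([xs[idx], ys[idx]], axis=1)

def stability_miou(masks):
    """Mean pairwise IoU across masks predicted from different random
    prompts of the same object (here, over the paper's 20 prompts)."""
    ious = []
    for i in range(len(masks)):
        for j in range(i + 1, len(masks)):
            inter = np.logical_and(masks[i], masks[j]).sum()
            union = np.logical_or(masks[i], masks[j]).sum()
            ious.append(inter / union if union else 1.0)
    return float(np.mean(ious))
```

Under this reading, training would mix ground-truth boxes, `add_box_noise` outputs, and `sample_point_prompts` outputs for each sample, while evaluation would feed 20 random prompts per image to the model and report `stability_miou` over the resulting predictions.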