Stable Segment Anything Model
Authors: Qi Fan, Xin Tao, Lei Ke, Mingqiao Ye, Di ZHANG, Pengfei Wan, Yu-Wing Tai, Chi-Keung Tang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments validate the effectiveness and advantages of our approach, underscoring Stable-SAM as a more robust solution for segmenting anything. Codes are at https://github.com/fanq15/Stable-SAM. (...) We evaluate the segmentation accuracy and stability of the ViT-Large based SAM with different prompt types and qualities, including box prompts with added noise (...) The evaluation utilizes four segmentation datasets as in HQ-SAM: DIS (...) (validation set), ThinObject-5K (...) (test set), COIFT (...), and HR-SOD (...). Table 1 tabulates that SAM's segmentation accuracy and stability significantly decrease with low-quality prompts (...). We perform detailed analysis on Stable-SAM on its network modules, model scalability, low-shot generalization, point prompt quality, backbone variants, relation to other methods, and stability visualization. |
| Researcher Affiliation | Collaboration | 1 Nanjing University, 2 Kuaishou Technology, 3 Carnegie Mellon University, 4 EPFL, 5 Dartmouth College, 6 The Hong Kong University of Science and Technology |
| Pseudocode | No | The paper describes methods verbally and uses mathematical formulations (e.g., Equation 1, 2, 3, 4, 5, 6) but does not present any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Codes are at https://github.com/fanq15/Stable-SAM. |
| Open Datasets | Yes | The evaluation utilizes four segmentation datasets as in HQ-SAM: DIS (Qin et al., 2022) (validation set), ThinObject-5K (Liew et al., 2021) (test set), COIFT (Mansilla & Miranda, 2019), and HR-SOD (Zeng et al., 2019). Furthermore, we validate the model's zero-shot generalization ability on three challenging segmentation benchmarks, including COCO (Lin et al., 2014), SGinW (Zou et al., 2023) and MESS (Blumenstiel et al., 2023). |
| Dataset Splits | Yes | The evaluation utilizes four segmentation datasets as in HQ-SAM: DIS (Qin et al., 2022) (validation set), ThinObject-5K (Liew et al., 2021) (test set), COIFT (Mansilla & Miranda, 2019), and HR-SOD (Zeng et al., 2019). (...) For every input image and prompt type, we randomly select 20 prompts to compute their segmentation stability (...). (...) we train all models on HQSeg-44K dataset, and evaluate their performance on four fine-grained segmentation datasets (...). (...) All models are trained with RTS by 1 training epoch, with 220/440 train images. |
| Hardware Specification | No | The paper does not mention any specific hardware used for training or inference, such as GPU models, CPU models, or memory specifications, beyond reporting the models' overall memory usage (e.g., 7.6 G). |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., programming languages, libraries, or frameworks). |
| Experiment Setup | Yes | All our Stable-SAM models are trained by just one epoch for fast adaptation unless otherwise stated. All other models are trained 12 epochs. (...) To address inaccurate prompts, our RTS incorporates prompts of varying qualities during training. These prompts include groundtruth boxes, box prompts with added noise (noise scale 0.4), and point prompts with varying numbers of points (1, 3, 10 positive points randomly chosen from the ground truth mask). |
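The prompt-degradation setup quoted in the Experiment Setup row (noisy box prompts with noise scale 0.4, and 1/3/10 positive points sampled from the ground-truth mask) can be sketched as below. This is a minimal illustration, not the paper's code: the jitter distribution is assumed uniform and proportional to box side lengths, and both function names are hypothetical.

```python
import numpy as np

def noisy_box_prompt(gt_box, noise_scale=0.4, rng=None):
    """Perturb a ground-truth box (x1, y1, x2, y2) with noise proportional
    to its width/height, mimicking the quoted noise scale of 0.4.
    Assumption: uniform jitter; the paper may use a different distribution."""
    rng = rng or np.random.default_rng()
    x1, y1, x2, y2 = gt_box
    w, h = x2 - x1, y2 - y1
    jitter = rng.uniform(-noise_scale, noise_scale, size=4) * np.array([w, h, w, h])
    return np.array([x1, y1, x2, y2], dtype=float) + jitter

def sample_point_prompts(gt_mask, n_points, rng=None):
    """Sample n positive point prompts uniformly from the foreground of a
    binary ground-truth mask (n in {1, 3, 10} per the quoted setup).
    Returns an (n_points, 2) array of (x, y) coordinates."""
    rng = rng or np.random.default_rng()
    ys, xs = np.nonzero(gt_mask)
    idx = rng.choice(len(ys), size=n_points, replace=False)
    return np.stack([xs[idx], ys[idx]], axis=1)
```

During training-time RTS, prompts of all three qualities (ground-truth box, noisy box, and sampled points) would be mixed; at evaluation, drawing many such random prompts per image is one way to probe segmentation stability as the table describes.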