SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement

Authors: Yuqi Lin, Hengjia Li, Wenqi Shao, Zheng Yang, Jun Zhao, Xiaofei He, Ping Luo, Kaipeng Zhang

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate our mask framework on a wide range of benchmarks under different settings, demonstrating better accuracy and efficiency."
Researcher Affiliation | Collaboration | 1 State Key Lab of CAD&CG, College of Computer Science, Zhejiang University; 2 Shanghai AI Laboratory; 3 FABU Inc.; 4 The University of Hong Kong
Pseudocode | Yes | "We present the pseudo-code for the region merging strategy in Algorithm 1, which is an important component of our split-then-merge (STM) pipeline for semantic segmentation."
Open Source Code | Yes | "Our code is available at SAMRefiner."
Open Datasets | Yes | "For a comprehensive evaluation of the mask refinement performance of SAMRefiner, we conduct experiments on a wide range of benchmarks, including those designed for mask refinement (DAVIS-585; Chen et al., 2022), instance segmentation (COCO; Lin et al., 2014), and semantic segmentation (VOC; Everingham et al., 2010) under different settings."
Dataset Splits | Yes | "To ensure a fair comparison, we maintain the same split of data subsets (e.g., 1%, 5%, 10%) as each baseline method. We assess pseudo label quality by randomly sampling 5,000 images in the train set (denoted as train5K) that have no intersection with annotated data subsets."
Hardware Specification | Yes | "The time cost reported in the paper is tested on a single 3090 GPU."
Software Dependencies | No | The paper mentions "We implement our method with PyTorch (Paszke et al., 2019)." but does not specify a version number for PyTorch or any other software component.
Experiment Setup | Yes | "The thresholds λ and µ used in the box and mask prompt are set to 0.1 and 0.5 respectively. The factors ω, γ for the Gaussian distribution are set to 15 and 4 by default. For the IoU adaptation step, we use the SGD optimizer with a 0.01 learning rate. The batch size is set to 5 and we only train for 1 epoch. The learning rate is reduced to one-tenth at steps 60 and 100. We use margin ranking loss with the margin as 0.02 and the LoRA rank is set to 4."
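The two quantitative pieces of the reported IoU adaptation setup, the step-decay learning-rate schedule (0.01 reduced to one-tenth at steps 60 and 100) and the margin ranking loss with margin 0.02, can be sketched in plain Python. This is a minimal illustration of those hyperparameters, not the released SAMRefiner code; the function names are ours.

```python
def step_lr(base_lr=0.01, step=0, milestones=(60, 100), gamma=0.1):
    """Step-decay schedule matching the reported setup: the learning
    rate is multiplied by 0.1 at each milestone step (60 and 100)."""
    lr = base_lr
    for m in milestones:
        if step >= m:
            lr *= gamma
    return lr


def margin_ranking_loss(score_pos, score_neg, margin=0.02):
    """Margin ranking loss with the reported margin of 0.02: zero when
    score_pos exceeds score_neg by at least the margin, and a linear
    penalty otherwise."""
    return max(0.0, margin - (score_pos - score_neg))
```

For example, `step_lr(step=0)` returns 0.01 and `step_lr(step=120)` returns roughly 0.0001, since both milestones have been passed.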