TinySAM: Pushing the Envelope for Efficient Segment Anything Model

Authors: Han Shu, Wenshuo Li, Yehui Tang, Yiman Zhang, Yihao Chen, Houqiang Li, Yunhe Wang, Xinghao Chen

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments on various zero-shot transfer tasks demonstrate the significantly advantageous performance of our TinySAM against counterpart methods." "We utilize the TinyViT-5M (Wu et al. 2022) as the lightweight student image encoder and SAM-H as the teacher model, following prior work (Zhang et al. 2023). 1% of the SA-1B dataset is used as the training data for full-stage distillation."
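The teacher–student setup quoted above (SAM-H distilling into a TinyViT-5M encoder) can be sketched with a generic feature-distillation objective. This is a minimal stand-in, not the paper's actual loss: the full-stage distillation in the paper also supervises downstream mask outputs and uses hard-mask weighting, which are omitted here.

```python
import numpy as np

def distillation_loss(student_emb, teacher_emb):
    """Mean-squared error between student and teacher image embeddings.

    A generic feature-distillation objective: the student encoder is
    trained to reproduce the (frozen) teacher's embedding for the same
    image. The paper's full-stage distillation adds further loss terms
    on mask outputs, which this sketch does not model.
    """
    student_emb = np.asarray(student_emb, dtype=np.float64)
    teacher_emb = np.asarray(teacher_emb, dtype=np.float64)
    return float(np.mean((student_emb - teacher_emb) ** 2))
```

In a training loop, this loss would be computed per batch on embeddings produced by the student (TinyViT-5M) and teacher (SAM-H) encoders, with gradients flowing only through the student.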
Researcher Affiliation | Collaboration | "1 University of Science and Technology of China, 2 Huawei Noah's Ark Lab"
Pseudocode | No | "The paper describes methods using equations and figures, but does not contain a clearly labeled pseudocode or algorithm block with structured steps."
Open Source Code | Yes | "Code: https://github.com/xinghaochen/TinySAM"
Open Datasets | Yes | "Together with the proposed SA-1B dataset, which contains 11 million high-resolution images and more than 1 billion high-quality segmentation masks, SAM shows impressive high-quality segmentation ability for objects of any category and shape." "We evaluate the zero-shot instance segmentation task for models on the benchmark of the COCO (Lin et al. 2014) dataset and LVIS v1 (Gupta, Dollar, and Girshick 2019)." "We choose a subset of the total 23 datasets used in (Kirillov et al. 2023) for efficient evaluation, which contains BBBC038v1 (Caicedo et al. 2019), DOORS (Pugliatti and Topputo 2022), TimberSeg (Fortin et al. 2022) and LVIS (Gupta, Dollar, and Girshick 2019)."
Dataset Splits | Yes | "1% of the SA-1B dataset is used as the training data for full-stage distillation." "We evaluate the zero-shot instance segmentation task for models on the benchmark of the COCO (Lin et al. 2014) dataset and LVIS v1 (Gupta, Dollar, and Girshick 2019)." "To make a fair comparison, we follow the settings of SAM (Kirillov et al. 2023) to sample the images and masks, and the first N masks in the corresponding split are used in the evaluation." "Evaluation on the first 100 images of the COCO val2017 set."
Hardware Specification | Yes | "The latency is tested with TensorRT on an NVIDIA T4 GPU." "The latency is tested on an NVIDIA T4 GPU." "Latency benchmarks are conducted on a single NVIDIA V100 GPU for everything mode."
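The latency measurements quoted above follow the standard warm-up-then-average protocol. A minimal, hardware-agnostic sketch of such a benchmark is below; it is an assumed harness, not the paper's TensorRT pipeline, and on a GPU an explicit device synchronization would be required after each invocation for the timings to be meaningful.

```python
import time

def measure_latency_ms(fn, warmup=10, iters=100):
    """Average wall-clock latency of `fn` in milliseconds.

    Warm-up iterations are discarded so one-time costs (memory
    allocation, kernel/JIT compilation, cache warming) do not skew
    the average. Note: this times host-side wall clock only; GPU
    benchmarks additionally need a synchronization barrier per call.
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed / iters * 1000.0
```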
Software Dependencies | No | "The paper mentions TensorRT but does not specify a version number. Other software, such as the Adam optimizer, is mentioned without version details for the underlying library or framework."
Experiment Setup | Yes | "We utilize the TinyViT-5M (Wu et al. 2022) as the lightweight student image encoder and SAM-H as the teacher model, following prior work (Zhang et al. 2023). 1% of the SA-1B dataset is used as the training data for full-stage distillation. We adopt the Adam optimizer and train the student network for 8 epochs. For each iteration, we sample 64 prompts according to the hard prompt sampling strategy. For post-training quantization, we set θl = 0.01, θu = 1.2, n = 100, rounds = 3 for the iterative search. We calibrate the quantized model on the SA-1B dataset using 8 images. The threshold values used in the everything mode are all kept the same as default."
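The quoted post-training quantization settings (θl = 0.01, θu = 1.2, n = 100, rounds = 3) describe an iterative search over candidate quantization scales. The sketch below shows one plausible reading: sweep n candidate scales between θl·max|x| and θu·max|x|, keep the scale with the lowest reconstruction error, and narrow the sweep around it for each subsequent round. The L2 objective and the narrowing schedule are assumptions; the paper's exact search criterion is not quoted here.

```python
import numpy as np

def search_quant_scale(x, n_bits=8, theta_l=0.01, theta_u=1.2, n=100, rounds=3):
    """Iterative grid search for a symmetric quantization scale.

    Candidate scales sweep [theta_l, theta_u] * max|x| with n points;
    each round re-centers a narrower sweep on the current best scale.
    Hyperparameter names mirror those reported in the paper, but the
    L2 reconstruction-error objective is an assumed choice.
    """
    x = np.asarray(x, dtype=np.float64)
    qmax = 2 ** (n_bits - 1) - 1          # e.g. 127 for 8-bit
    lo = theta_l * np.abs(x).max()
    hi = theta_u * np.abs(x).max()
    best_scale, best_err = hi, np.inf
    for _ in range(rounds):
        for scale in np.linspace(lo, hi, n):
            # quantize, dequantize, and score the round-trip error
            q = np.clip(np.round(x / scale * qmax), -qmax - 1, qmax)
            err = np.sum((q * scale / qmax - x) ** 2)
            if err < best_err:
                best_err, best_scale = err, scale
        # narrow the sweep to one grid step around the best candidate
        step = (hi - lo) / n
        lo = max(best_scale - step, 1e-12)
        hi = best_scale + step
    return best_scale
```

Calibration would run this search per tensor on activations collected from the small calibration set (8 SA-1B images in the quoted setup).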