Part-aware Prompted Segment Anything Model for Adaptive Segmentation

Authors: Chenhui Zhao, Liyue Shen

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | P2SAM improves performance by +8.0% and +2.0% mean Dice score on two different patient-adaptive segmentation applications, respectively. In addition, P2SAM exhibits impressive generalizability in other adaptive segmentation tasks in the natural image domain, e.g., +6.4% mIoU on the personalized object segmentation task. The code is available at https://github.com/Zch0414/p2sam
Researcher Affiliation | Academia | Chenhui Zhao, Department of Computer Science and Engineering, University of Michigan; Liyue Shen, Department of Electrical and Computer Engineering, University of Michigan
Pseudocode | No | The paper describes the methodology in prose and mathematical equations but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available at https://github.com/Zch0414/p2sam
Open Datasets | Yes | We utilize a total of four medical datasets, including two internal datasets: the NSCLC-Radiomics dataset (Aerts et al., 2015), collected for non-small cell lung cancer (NSCLC) segmentation, contains data from 422 patients; each patient has a computed tomography (CT) volume along with corresponding segmentation annotations. The Kvasir-SEG dataset (Jha et al., 2020) contains 1000 labeled endoscopy polyp images. Two external datasets come from different institutions: the 4D-Lung dataset (Hugo et al., 2016), collected for longitudinal analysis, contains data from 20 patients, of whom 13 underwent multiple visits (3 to 8 visits each); for each visit, a CT volume along with corresponding segmentation labels is available. The CVC-ClinicDB dataset (Bernal et al., 2015) contains 612 labeled polyp images selected from 29 endoscopy videos.
Dataset Splits | Yes | Each dataset was randomly split into three subsets: training, validation, and testing, with an 80:10:10 percent ratio (patient-wise splitting for the NSCLC-Radiomics dataset to prevent data leakage).
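The patient-wise splitting mentioned above can be sketched as follows. This is a minimal illustration of the stated 80:10:10 ratio, not the paper's actual splitting code; the function name `patient_wise_split` and the fixed seed are assumptions for reproducibility of the example.

```python
import random

def patient_wise_split(patient_ids, ratios=(0.8, 0.1, 0.1), seed=0):
    """Split at the patient level so every volume/slice from one patient
    lands in exactly one subset, preventing cross-split data leakage.
    Illustrative sketch only; the paper does not publish its split code."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # deterministic shuffle for the example
    n = len(ids)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    # Remaining patients form the test set, so the three subsets cover all ids.
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]

# For the 422 NSCLC-Radiomics patients this yields a 337/42/43 split.
train, val, test = patient_wise_split(range(422))
```

The key design point is that shuffling and slicing happen over patient identifiers, not individual CT slices, so no patient contributes data to more than one subset.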
Hardware Specification | Yes | All experiments are conducted on A40 GPUs.
Software Dependencies | No | The paper mentions using the AdamW optimizer and SAM as a backbone model but does not specify version numbers for any key software components such as programming languages, libraries, or frameworks.
Experiment Setup | Yes | We fine-tune the model for 36 epochs on the NSCLC-Radiomics dataset and 100 epochs on the Kvasir-SEG dataset with a batch size of 4. The initial learning rate is 1e-4, and the fine-tuning process is guided by cosine learning rate decay, with a linear learning rate warm-up over the first 10% of epochs. We optimize the model with the AdamW optimizer (Loshchilov & Hutter, 2017) (β1=0.9, β2=0.999) and a weight decay of 0.05. We further regularize SAM's encoder with a drop path rate of 0.1.
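The learning-rate schedule in the setup above (linear warm-up over the first 10% of epochs, then cosine decay from 1e-4) can be sketched in plain Python. The function name `lr_at_epoch` and the decay-to-zero floor are assumptions; the paper states only the schedule shape and the initial rate.

```python
import math

def lr_at_epoch(epoch, total_epochs=36, base_lr=1e-4, warmup_frac=0.10):
    """Linear warm-up for the first 10% of epochs, then cosine decay.
    Defaults match the NSCLC-Radiomics fine-tuning run (36 epochs, lr 1e-4);
    this is an illustrative sketch, not the authors' training code."""
    warmup_epochs = max(1, round(total_epochs * warmup_frac))
    if epoch < warmup_epochs:
        # Ramp linearly from base_lr / warmup_epochs up to base_lr.
        return base_lr * (epoch + 1) / warmup_epochs
    # Cosine decay from base_lr down toward 0 over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With these defaults the rate peaks at 1e-4 at the end of warm-up (epoch 3) and decays smoothly toward zero by the final epoch; for the 100-epoch Kvasir-SEG run, pass `total_epochs=100`.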