INT: Instance-Specific Negative Mining for Task-Generic Promptable Segmentation

Authors: Jian Hu, Zixu Cheng, Shaogang Gong

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "The effectiveness of our INT is validated on six datasets, including camouflaged objects and medical images, showing its robustness and applicability." "To assess performance in the first three tasks, we employ the following metrics: Mean Absolute Error (M), adaptive F-measure (Fβ), mean E-measure (Eϕ), and structure measure (Sα)."
Researcher Affiliation | Academia | Jian Hu, Zixu Cheng, Shaogang Gong; Queen Mary University of London.
Pseudocode | No | The paper describes the methodology using textual descriptions and mathematical equations, but it does not include any clearly labeled pseudocode blocks or algorithm sections.
Open Source Code | No | The paper does not contain an explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets | Yes | "We tested INT on three benchmark datasets: CHAMELEON [Skurowski et al., 2018], CAMO [Le et al., 2019], and COD10K [Fan et al., 2021a]. We utilized datasets such as CVC-ColonDB [Tajbakhsh et al., 2015] and Kvasir [Jha et al., 2020] for polyp image segmentation, and ISIC [Codella et al., 2019] for skin lesion segmentation."
Dataset Splits | Yes | "The CAMO dataset contains 1,250 images, divided into 1,000 training images and 250 testing images. The COD10K dataset comprises 3,040 training samples and 2,026 testing samples in total."
Hardware Specification | Yes | "Our experiments are conducted on a single NVIDIA A100 GPU, with further details provided in the appendix."
Software Dependencies | Yes | "For the VLM models, we employ LLaVA-1.5-13B for evaluation. For image processing, we use the CS-ViT-B/16 model pre-trained with CLIP, and for the image inpainting module, we deploy stable-diffusion-2-inpainting."
Experiment Setup | Yes | "All tasks undergo training-free test-time adaptation, iterating for four epochs, except for the polyp image segmentation task, which extends to six epochs. We utilize the ViT-H/16 model for promptable segmentation methods. The task-generic prompt for the COD task is specified as 'camouflaged animal'. The MIS task includes two sub-tasks: polyp image segmentation and skin lesion segmentation, prompted by 'polyp' and 'skin lesion' respectively." The hyperparameter w is set to 0.3.
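
The experiment-setup row above can be condensed into a single configuration sketch. This is purely illustrative: the key names are hypothetical and not taken from any released INT code; only the values (epoch counts, backbone, prompts, w) come from the paper's quoted text.

```python
# Hypothetical configuration mirroring the reported INT experiment setup.
# Key names are our own; values are taken from the quoted setup description.
INT_CONFIG = {
    "adaptation": {
        "training_free": True,          # test-time adaptation, no training
        "epochs_default": 4,            # all tasks iterate for four epochs...
        "epochs_polyp": 6,              # ...except polyp segmentation (six)
    },
    "segmenter_backbone": "ViT-H/16",   # promptable segmentation model
    "task_prompts": {
        "COD": "camouflaged animal",    # camouflaged object detection
        "polyp": "polyp",               # MIS sub-task 1
        "skin_lesion": "skin lesion",   # MIS sub-task 2
    },
    "w": 0.3,                           # hyperparameter from the paper
}
```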
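
The metrics named in the Research Type row (M, Fβ, Eϕ, Sα) are standard in camouflaged-object and medical-image segmentation benchmarks. As an illustration only, here is a minimal NumPy sketch of M and an adaptive Fβ; the function names, the 2×-mean adaptive threshold, and β² = 0.3 are common conventions in this literature, not details confirmed by the paper (Eϕ and Sα have more involved definitions and are omitted).

```python
import numpy as np

def mae(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean Absolute Error (M): mean pixel-wise |prediction - ground truth|."""
    return float(np.mean(np.abs(pred - gt)))

def adaptive_f_measure(pred: np.ndarray, gt: np.ndarray, beta2: float = 0.3) -> float:
    """Adaptive F-measure (Fβ): binarize the prediction at twice its mean
    value (capped at 1.0), then compute F_beta with beta^2 = 0.3."""
    thr = min(2.0 * float(pred.mean()), 1.0)
    binary = pred >= thr
    mask = gt > 0.5
    tp = np.logical_and(binary, mask).sum()
    precision = tp / (binary.sum() + 1e-8)
    recall = tp / (mask.sum() + 1e-8)
    return float((1 + beta2) * precision * recall
                 / (beta2 * precision + recall + 1e-8))
```

A perfect binary prediction yields M = 0 and Fβ ≈ 1, which is a quick sanity check when reimplementing these metrics.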