Training-Free Dataset Pruning for Instance Segmentation

Authors: Yalun Dai, Lingao Xiao, Ivor Tsang, Yang He

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We achieve state-of-the-art results on VOC 2012, Cityscapes, and COCO datasets, generalizing well across CNN and Transformer architectures. Remarkably, our approach accelerates the pruning process by an average of 1349× on COCO compared to the adapted baselines. ... In our experiments, we conduct extensive comparisons between our implemented baselines and our model-independent TFDP, including performance comparisons, generalizability, and time consumption..."
Researcher Affiliation | Academia | "Yalun Dai 1,2,3, Lingao Xiao 1,2,4, Ivor W. Tsang 1,2,3, Yang He 1,2,4 — 1 CFAR, Agency for Science, Technology and Research, Singapore; 2 IHPC, Agency for Science, Technology and Research, Singapore; 3 Nanyang Technological University; 4 National University of Singapore"
Pseudocode | Yes | Appendix A ("A Training-Free Dataset Pruning (TFDP) Algorithm"): "Algorithm 1 illustrates the code implementation process of our TFDP." (Algorithm 1: Training-Free Dataset Pruning (TFDP))
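The paper's Algorithm 1 is not reproduced in this report. As a rough, hypothetical sketch of the general shape of training-free pruning — scoring samples without any model training in the loop, then keeping the top-ranked fraction — the `prune_dataset` helper and the instance-count scoring rule below are illustrative placeholders, not the paper's TFDP criterion:

```python
def prune_dataset(scores, keep_ratio):
    """Keep the top `keep_ratio` fraction of samples by score.

    `scores` maps sample id -> importance score. Training-free methods
    compute such scores from annotations or features alone, with no model
    training in the loop. The scoring rule used below is illustrative,
    not the paper's TFDP score.
    """
    n_keep = max(1, int(len(scores) * keep_ratio))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_keep])

# Hypothetical example: score each image by its instance count.
image_instance_counts = {"img_0": 12, "img_1": 3, "img_2": 7, "img_3": 1}
kept = prune_dataset(image_instance_counts, keep_ratio=0.5)
# keeps the two images with the most instances: img_0 and img_2
```

The point of the sketch is only the structure: a training-free method replaces expensive train-then-score loops with a single ranking pass, which is where the reported speedup over adapted baselines comes from.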
Open Source Code | Yes | "Source code is available at: https://github.com/he-y/dataset-pruning-for-instance-segmentation."
Open Datasets | Yes | "To evaluate the proposed method TFDP, we conduct instance segmentation experiments on three mainstream datasets: VOC 2012 (Everingham et al., 2010), Cityscapes (Cordts et al., 2016), and MS COCO (Lin et al., 2014)."
Dataset Splits | Yes | "MS COCO is a popular instance segmentation dataset, which contains an 80-category label set with instance-level annotations. Following previous works (He et al., 2017; Wang et al., 2020b), we use the COCO train 2017 (118K training images) for training, and the ablation study is carried out on the val 2017 (5K validation images)."
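The COCO splits cited above ship as JSON annotation files (e.g. `annotations/instances_train2017.json`). A minimal sketch of reading per-image instance counts from a COCO-format dict — the tiny inline dict stands in for the real 118K-image train2017 file, and the field names follow the public COCO annotation format:

```python
from collections import Counter

# Minimal stand-in for a COCO-format annotation file such as
# annotations/instances_train2017.json (118K images in the real split).
coco = {
    "images": [{"id": 1, "file_name": "a.jpg"}, {"id": 2, "file_name": "b.jpg"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 18},
        {"id": 11, "image_id": 1, "category_id": 18},
        {"id": 12, "image_id": 2, "category_id": 1},
    ],
}

def instances_per_image(coco_dict):
    """Map image_id -> number of instance annotations (COCO format)."""
    counts = Counter(a["image_id"] for a in coco_dict["annotations"])
    # Images with no annotations still get an explicit zero.
    return {img["id"]: counts.get(img["id"], 0) for img in coco_dict["images"]}

per_image = instances_per_image(coco)
# per_image == {1: 2, 2: 1}
```

In practice one would load the real file with `json.load` (or the `pycocotools` COCO API); the iteration pattern over `"images"` and `"annotations"` is the same.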
Hardware Specification | Yes | "For the VOC experiment, we used a single NVIDIA 3090 GPU. For the Cityscapes experiment, we used two NVIDIA 3090 GPUs. For the COCO experiment, we used two NVIDIA A100 80G GPUs. ... To ensure a fair comparison, all time consumption experiments were conducted on the same machine: PyTorch on Ubuntu 20.04, with NVIDIA RTX 3090 GPUs and CUDA 11.3. We used two NVIDIA 3090 GPUs for training and one NVIDIA 3090 GPU for inference."
Software Dependencies | Yes | "To ensure a fair comparison, all time consumption experiments were conducted on the same machine: PyTorch on Ubuntu 20.04, with NVIDIA RTX 3090 GPUs and CUDA 11.3. We used two NVIDIA 3090 GPUs for training and one NVIDIA 3090 GPU for inference."
Experiment Setup | Yes | "For hyperparameters, we follow the settings described in the original paper, with details provided in Appendix C.3. ... We use ResNet-50 as the backbone for Mask R-CNN and FPN to extract multi-scale features. All models were trained using the default hyperparameters as specified in MMDetection (Chen et al., 2019a)."