Training-Free Dataset Pruning for Instance Segmentation

Authors: Yalun Dai, Lingao Xiao, Ivor Tsang, Yang He

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We achieve state-of-the-art results on VOC 2012, Cityscapes, and COCO datasets, generalizing well across CNN and Transformer architectures. Remarkably, our approach accelerates the pruning process by an average of 1349× on COCO compared to the adapted baselines. ... In our experiments, we conduct extensive comparisons between our implemented baselines and our model-independent TFDP, including performance comparisons, generalizability, and time consumption..."
Researcher Affiliation | Academia | "Yalun Dai 1,2,3, Lingao Xiao 1,2,4, Ivor W. Tsang 1,2,3, Yang He 1,2,4 — 1 CFAR, Agency for Science, Technology and Research, Singapore; 2 IHPC, Agency for Science, Technology and Research, Singapore; 3 Nanyang Technological University; 4 National University of Singapore"
Pseudocode | Yes | Appendix A ("A Training-Free Dataset Pruning (TFDP) Algorithm"): "Algorithm 1 illustrates the code implementation process of our TFDP." (Algorithm 1: Training-Free Dataset Pruning (TFDP))
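The paper's Algorithm 1 is not reproduced in this report. As a rough, hypothetical sketch of the general shape of training-free pruning — scoring samples without any model training in the loop, then keeping the top-ranked fraction — the `prune_dataset` helper and the instance-count scoring rule below are illustrative placeholders, not the paper's TFDP criterion:

```python
def prune_dataset(scores, keep_ratio):
    """Keep the top `keep_ratio` fraction of samples by score.

    `scores` maps sample id -> importance score. Training-free methods
    compute such scores from annotations or features alone, with no model
    training in the loop. The scoring rule used below is illustrative,
    not the paper's TFDP score.
    """
    n_keep = max(1, int(len(scores) * keep_ratio))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_keep])

# Hypothetical example: score each image by its instance count.
image_instance_counts = {"img_0": 12, "img_1": 3, "img_2": 7, "img_3": 1}
kept = prune_dataset(image_instance_counts, keep_ratio=0.5)
# keeps the two images with the most instances: img_0 and img_2
```

The point of the sketch is only the structure: a training-free method replaces expensive train-then-score loops with a single ranking pass, which is where the reported speedup over adapted baselines comes from.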
Open Source Code | Yes | "Source code is available at: https://github.com/he-y/dataset-pruning-for-instance-segmentation."
Open Datasets | Yes | "To evaluate the proposed method TFDP, we conduct instance segmentation experiments on three mainstream datasets: VOC 2012 (Everingham et al., 2010), Cityscapes (Cordts et al., 2016), and MS COCO (Lin et al., 2014)."
Dataset Splits | Yes | "MS COCO is a popular instance segmentation dataset, which contains an 80-category label set with instance-level annotations. Following previous works (He et al., 2017; Wang et al., 2020b), we use the COCO train 2017 (118K training images) for training, and the ablation study is carried out on the val 2017 (5K validation images)."
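The COCO splits cited above ship as JSON annotation files (e.g. `annotations/instances_train2017.json`). A minimal sketch of reading per-image instance counts from a COCO-format dict — the tiny inline dict stands in for the real 118K-image train2017 file, and the field names follow the public COCO annotation format:

```python
from collections import Counter

# Minimal stand-in for a COCO-format annotation file such as
# annotations/instances_train2017.json (118K images in the real split).
coco = {
    "images": [{"id": 1, "file_name": "a.jpg"}, {"id": 2, "file_name": "b.jpg"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 18},
        {"id": 11, "image_id": 1, "category_id": 18},
        {"id": 12, "image_id": 2, "category_id": 1},
    ],
}

def instances_per_image(coco_dict):
    """Map image_id -> number of instance annotations (COCO format)."""
    counts = Counter(a["image_id"] for a in coco_dict["annotations"])
    # Images with no annotations still get an explicit zero.
    return {img["id"]: counts.get(img["id"], 0) for img in coco_dict["images"]}

per_image = instances_per_image(coco)
# per_image == {1: 2, 2: 1}
```

In practice one would load the real file with `json.load` (or the `pycocotools` COCO API); the iteration pattern over `"images"` and `"annotations"` is the same.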
Hardware Specification | Yes | "For the VOC experiment, we used a single NVIDIA 3090 GPU. For the Cityscapes experiment, we used two NVIDIA 3090 GPUs. For the COCO experiment, we used two NVIDIA A100 80G GPUs. ... To ensure a fair comparison, all time consumption experiments were conducted on the same machine: PyTorch on Ubuntu 20.04, with NVIDIA RTX 3090 GPUs and CUDA 11.3. We used two NVIDIA 3090 GPUs for training and one NVIDIA 3090 GPU for inference."
Software Dependencies | Yes | "To ensure a fair comparison, all time consumption experiments were conducted on the same machine: PyTorch on Ubuntu 20.04, with NVIDIA RTX 3090 GPUs and CUDA 11.3. We used two NVIDIA 3090 GPUs for training and one NVIDIA 3090 GPU for inference."
Experiment Setup | Yes | "For hyperparameters, we follow the settings described in the original paper, with details provided in Appendix C.3. ... We use ResNet-50 as the backbone for Mask R-CNN and FPN to extract multi-scale features. All models were trained using the default hyperparameters as specified in MMDetection (Chen et al., 2019a)."