A Training-free Synthetic Data Selection Method for Semantic Segmentation

Authors: Hao Tang, Siyue Yu, Jian Pang, Bingfeng Zhang

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "The experimental results show that using our method significantly reduces the data size by half, while the trained segmenter achieves higher performance." ... Paper sections: Experiments; Dataset and Evaluation Metrics; Implementation Details; Performance Comparison; Ablation Studies.
Researcher Affiliation | Academia | Hao Tang¹, Siyue Yu², Jian Pang¹, Bingfeng Zhang¹* (¹China University of Petroleum (East China), ²XJTLU). EMAIL, EMAIL, EMAIL
Pseudocode | No | The paper describes its methods in text and uses Figure 2 for an overview, but no structured pseudocode or algorithm blocks are provided.
Open Source Code | Yes | Code: https://github.com/tanghao2000/SDS
Open Datasets | Yes | "We evaluate our method on PASCAL VOC 2012 (Everingham et al. 2015) and MS COCO 2017 (Lin et al. 2014)."
Dataset Splits | Yes | "PASCAL VOC 2012 has 20 object classes and 1 background class, which is augmented by SBD (Hariharan et al. 2011) to obtain 10,584 training, 1,449 validation, and 1,456 test images. For the synthetic dataset, we follow Dataset Diffusion (Nguyen et al. 2024) to produce 40k image-annotation pairs as the initial dataset. ... MS COCO 2017 contains 80 object classes and 1 background class with 118k training and 5k validation images."
Hardware Specification | Yes | "All experiments are running on NVIDIA RTX 3090 GPUs."
Software Dependencies | No | The paper mentions models such as the pre-trained CLIP ViT-B-16, DeepLabv3, Mask2Former, and CDL, but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | "Hyperparameters: In Perturbation-based CLIP Similarity (PCS), Ns ∈ {8, 16, 32} represents the patch scale; No is the number of patch orders, and we set No = 3. In Eq. 6, the thresholds τs and τ_PCS are set to 0.8 and 0.1, respectively. In the class-balance Annotation Similarity Filter (ASF), we set n to 0.6 in Eq. (15), i.e., the top 60% of samples within each group are selected."
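The two selection stages quoted above (a threshold test on CLIP-based scores in Eq. 6, then a top-60% per-group cut in Eq. 15) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' released code: the function names, the score inputs (`clip_sim`, `pcs_score`), the comparison directions, and the per-class grouping are all assumptions; only the numeric values τs = 0.8, τ_PCS = 0.1, and n = 0.6 come from the row.

```python
import math

def pcs_filter(clip_sim, pcs_score, tau_s=0.8, tau_pcs=0.1):
    """Hypothetical Eq. 6 filter: keep sample indices whose global CLIP
    similarity clears tau_s and whose perturbation-based CLIP similarity
    score clears tau_pcs (comparison directions are assumptions)."""
    return [i for i, (s, p) in enumerate(zip(clip_sim, pcs_score))
            if s >= tau_s and p >= tau_pcs]

def asf_select(scores_by_class, n=0.6):
    """Hypothetical class-balance ASF step (Eq. 15): within each class
    group, rank samples by annotation similarity and keep the top
    n fraction (here the top 60%)."""
    selected = {}
    for cls, scores in scores_by_class.items():
        # Indices sorted by score, highest first.
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        k = max(1, math.ceil(n * len(scores)))
        selected[cls] = order[:k]
    return selected
```

Applying the cut per class group (rather than globally) is what makes the filter "class-balanced": every class contributes its own top fraction, so rare classes are not crowded out by high-scoring samples of common ones.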