Revisit the Open Nature of Open Vocabulary Semantic Segmentation
Authors: Qiming Huang, Han Hu, Jianbo Jiao
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental evaluations show that the proposed mask-wise protocol provides a more effective and reliable evaluation framework for OVS models compared to the previous pixel-wise approach on the perspective of open-world. Moreover, analysis of mismatched mask pairs reveals that a large amount of ambiguous categories exist in commonly used OVS datasets. Interestingly, we find that reducing these ambiguities during both training and inference enhances capabilities of OVS models. These findings and the new evaluation protocol encourage further exploration of the open nature of OVS, as well as broader open-world challenges. Project page: https://qiming-huang.github.io/Revisit OVS/. |
| Researcher Affiliation | Academia | Qiming Huang, Han Hu, Jianbo Jiao; The MIx Group, School of Computer Science, University of Birmingham; {qxh366, hxh864}.EMAIL, EMAIL |
| Pseudocode | Yes | A.1 PSEUDOCODE OF THE MASK-WISE EVALUATION PROTOCOL Algorithm 1 Mask-Wise Evaluation Protocol |
| Open Source Code | No | Project page: https://qiming-huang.github.io/Revisit OVS/. The paper provides a project page URL, but it does not explicitly state that the code for the methodology is released or provide a direct link to a code repository within the text. |
| Open Datasets | Yes | Following previous OVS works (Cho et al., 2023; Xie et al., 2023; Xu et al., 2023), we train the models on the COCO-Stuff171 (Caesar et al., 2018) dataset with 171 categories and perform zero-shot evaluation on ADE20K (Zhou et al., 2019) and PASCAL-Context (Mottaghi et al., 2014) datasets. |
| Dataset Splits | Yes | Following previous OVS works (Cho et al., 2023; Xie et al., 2023; Xu et al., 2023), we train the models on the COCO-Stuff171 (Caesar et al., 2018) dataset with 171 categories and perform zero-shot evaluation on ADE20K (Zhou et al., 2019) and PASCAL-Context (Mottaghi et al., 2014) datasets. |
| Hardware Specification | Yes | The experiments were conducted on an NVIDIA A100 GPU. |
| Software Dependencies | No | The paper mentions various models and frameworks (e.g., CLIP, FCNs, Transformer-based, LSeg, OpenSeg, CAT-Seg, MaskCLIP, SED, MAFT+) but does not provide specific version numbers for any programming languages, libraries, or software environments used in the experiments. |
| Experiment Setup | Yes | The threshold $\hat{\tau}$ is set to 0.8. |
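
The table reports a threshold of 0.8 for the mask-wise evaluation protocol but does not say which quantity it is applied to. The sketch below is an illustration only, not the paper's Algorithm 1: it assumes the threshold is an IoU cutoff for pairing predicted masks with ground-truth masks, and it uses a greedy one-to-one matching chosen here for simplicity. The names `mask_iou`, `match_masks`, and `tau` are hypothetical.

```python
from typing import List, Sequence, Tuple

# A binary mask stored as rows of 0/1 values (assumed representation).
Mask = Sequence[Sequence[int]]


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two binary masks of equal shape."""
    inter = 0
    union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            if pa and pb:
                inter += 1
            if pa or pb:
                union += 1
    return inter / union if union else 0.0


def match_masks(
    preds: Sequence[Mask], gts: Sequence[Mask], tau: float = 0.8
) -> List[Tuple[int, int, float]]:
    """Greedily pair predicted and ground-truth masks whose IoU is >= tau.

    Assumption: tau plays the role of the threshold reported in the table.
    Each mask is used in at most one pair, highest IoU first.
    """
    candidates = sorted(
        (
            (mask_iou(p, g), i, j)
            for i, p in enumerate(preds)
            for j, g in enumerate(gts)
        ),
        reverse=True,
    )
    used_preds, used_gts = set(), set()
    matches: List[Tuple[int, int, float]] = []
    for iou, i, j in candidates:
        if iou < tau:
            break  # remaining candidates have lower IoU
        if i in used_preds or j in used_gts:
            continue
        used_preds.add(i)
        used_gts.add(j)
        matches.append((i, j, iou))
    return matches
```

Under these assumptions, a category-level comparison would then be made only on the matched pairs, which is what distinguishes a mask-wise protocol from a pixel-wise one.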