Robust and Consistent Online Video Instance Segmentation via Instance Mask Propagation
Authors: Miran Heo, Seoung Wug Oh, Seon Joo Kim, Joon-Young Lee
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments Datasets YouTube-VIS. YouTube-VIS (Yang, Fan, and Xu 2019) is a standard benchmark dataset for VIS in three versions (2019/2021/2022). Each version is designed to segment objects from 40 predefined categories within videos. ... Quantitative Results YouTube-VIS 2019 & 2021. Due to space constraints, detailed results are provided in the supplementary material. ... Ablation Study In this section, we analyze the main components of RoCoVIS and evaluate their impact in Tab. 2. |
| Researcher Affiliation | Collaboration | Miran Heo¹*, Seoung Wug Oh², Seon Joo Kim¹, Joon-Young Lee² (¹Yonsei University, ²Adobe Research) |
| Pseudocode | No | The paper describes its methodology through textual explanations and mathematical equations (e.g., Eq. 1-4) within the 'Method' section, but it does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor structured code-like procedures. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code for the methodology described, nor does it provide any links to a code repository. |
| Open Datasets | Yes | Datasets YouTube-VIS. YouTube-VIS (Yang, Fan, and Xu 2019) is a standard benchmark dataset for VIS in three versions (2019/2021/2022). ... OVIS. Occluded-VIS (OVIS) dataset (Qi et al. 2021) has been introduced to specifically address the challenging scenario of heavy occlusions between objects. ... HQ-YTVIS. ... HQ-YTVIS (Ke et al. 2022b) refines mask annotation of YouTube-VIS 2019 ... VIPSeg. We utilize VIPSeg (Miao et al. 2022), introduced for Video Panoptic Segmentation (VPS) (Kim et al. 2020). |
| Dataset Splits | Yes | YouTube-VIS 2022. Tab. 1 showcases our performance on the challenging benchmark, the long video validation split of YouTube-VIS 2022. ... OVIS. We also present our results on the OVIS dataset in Tab. 1, which features highly occluded instances across long videos. ... HQ-YTVIS & VIPSeg-things. We demonstrate that RoCoVIS produces high-quality consistent mask outputs in Tab. 3. ... VIPSeg. We follow the original data split, while simply converting the things annotations into VIS annotations. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions backbones such as Swin-L and ResNet-50 and Transformer-based architectures, but it does not specify any software libraries, frameworks, or programming languages with their version numbers. |
| Experiment Setup | No | The paper describes conceptual aspects of training and inference, including modifications to the UVLA criterion, but it does not provide specific experimental setup details such as hyperparameters (e.g., learning rate, batch size, epochs, optimizer settings) in the main text. |