Semi-Supervised Semantic Segmentation via Marginal Contextual Information

Authors: Moshe Kimhi, Shai Kimhi, Evgenii Zheltonozhskii, Or Litany, Chaim Baskin

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type Experimental Through extensive experiments on standard benchmarks, we demonstrate that S4MC outperforms existing state-of-the-art semi-supervised learning approaches, offering a promising solution for reducing the cost of acquiring dense annotations. For example, S4MC achieves a 1.39 mIoU improvement over the prior art on PASCAL VOC 12 with 366 annotated images. [...] 4 Experiments This section presents our experimental results. The setup for the different datasets and partition protocols is detailed in Section 4.1. Section 4.2 compares our method against existing approaches and Section 4.3 provides the ablation study.
Researcher Affiliation Collaboration Moshe Kimhi (EMAIL), Computer Science Department, Technion; Shai Kimhi (EMAIL), Computer Science Department, Technion; Evgenii Zheltonozhskii (EMAIL), Physics Department, Technion; Or Litany (EMAIL), Computer Science Department, Technion and NVIDIA; Chaim Baskin (EMAIL), Computer Science Department, Technion
Pseudocode Yes Algorithm 1: Pseudocode: Pseudo-label refinement of S4MC, PyTorch-like style.
Open Source Code Yes The code to reproduce our experiments is available at https://s4mcontext.github.io/.
Open Datasets Yes Datasets In our experiments, we use PASCAL VOC 12 (Everingham et al., 2010), Cityscapes (Cordts et al., 2016), and MS COCO (Lin et al., 2014) datasets.
Dataset Splits Yes Evaluation We compare S4MC with state-of-the-art methods and baselines under the standard partition protocols using 1/2, 1/4, 1/8, and 1/16 of the training set as labeled data. For the classic setting of the PASCAL experiment, we additionally use all the finely annotated images. We follow standard protocols and use mean Intersection over Union (mIoU) as our evaluation metric. We use the data split published by Wang et al. (2022) when available to ensure a fair comparison. For the ablation studies, we use PASCAL VOC 12 val with the 1/4 partition.
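The review quotes mIoU as the evaluation metric but the paper excerpt does not define it. A minimal sketch of the standard computation (the function name mean_iou and the toy arrays are ours, not from the paper; classes absent from both prediction and target are skipped, as is common practice):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union across classes.

    pred, target: integer label maps of identical shape.
    Classes absent from both maps are excluded from the mean.
    """
    ious = []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class c appears in neither map
        inter = np.logical_and(p, t).sum()
        ious.append(inter / union)
    return float(np.mean(ious))

# Toy example: class 0 has IoU 1, classes 1 and 2 each have IoU 1/2.
pred = np.array([[0, 1], [1, 2]])
target = np.array([[0, 1], [2, 2]])
print(mean_iou(pred, target, num_classes=3))  # 2/3
```

Benchmark implementations typically accumulate per-class intersections and unions over the whole validation set before dividing, rather than averaging per-image scores.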
Hardware Specification Yes To verify that, we conducted a training time analysis comparing FixMatch and FixMatch + S4MC over PASCAL with 366 labeled examples, using distributed training with 8 Nvidia RTX 3090 GPUs. [...] All experiments are conducted on a machine with 8 Nvidia RTX A5000 GPUs.
Software Dependencies No No specific version numbers are provided for key software components such as PyTorch or other libraries. The text only mentions "PyTorch-like pseudo-code" and the use of specific architectures (DeepLabv3+, ResNet-101, Xception-65) and optimizers (SGD) without versioning.
Experiment Setup Yes All experiments were conducted for 80 training epochs with the stochastic gradient descent (SGD) optimizer with a momentum of 0.9 and the learning rate policy lr = lr_base · (1 − iter/total_iter)^power. [...] For PASCAL VOC 12, lr_base = 0.001 (lr_base = 0.01 for the decoder only), the weight decay is set to 0.0001, all images are cropped to 513×513, and Bℓ = Bu = 3. For Cityscapes, all parameters use lr_base = 0.01, and the weight decay is set to 0.0005. The learning rate decay parameter is set to power = 0.9. Due to memory constraints, all images are cropped to 769×769 and Bℓ = Bu = 2.
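The quoted learning rate policy is the standard polynomial ("poly") decay used in semantic segmentation. A minimal sketch of the stated schedule (the function name poly_lr is ours; the paper sets power = 0.9 and per-dataset base rates as quoted above):

```python
def poly_lr(base_lr, step, total_steps, power=0.9):
    """Polynomial decay: lr = base_lr * (1 - step/total_steps) ** power."""
    return base_lr * (1.0 - step / total_steps) ** power

# The rate starts at base_lr and decays to 0 at the final step.
print(poly_lr(0.001, 0, 1000))     # 0.001
print(poly_lr(0.001, 1000, 1000))  # 0.0
```

In PyTorch this schedule is also available out of the box as torch.optim.lr_scheduler.PolynomialLR.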