Progressive Parameter Efficient Transfer Learning for Semantic Segmentation

Authors: Nan Zhou, Huiqun Wang, Yaoyan Zheng, Di Huang

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct a comprehensive evaluation of the proposed ProPETL for segmentation adaptation on benchmarks, including PASCAL VOC2012 (Everingham et al., 2015), ADE20k (Zhou et al., 2019), COCO-Stuff10k (Caesar et al., 2018), and Cityscapes (Cordts et al., 2016). The mean intersection over union (mIoU) and mean accuracy (mAcc) on the validation set across all benchmarks are used as metrics to provide a comprehensive comparison with state-of-the-art methods. All experiments are implemented using MMSegmentation (Contributors, 2020) on NVIDIA A800 GPU.
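For reference, the two reported metrics can be computed from a per-class confusion matrix. The sketch below is a minimal, simplified version (pure Python, toy data); the paper relies on MMSegmentation's implementation, which additionally handles an ignore index and absent classes, so this is an illustration of the definitions rather than the authors' exact code.

```python
def miou_macc(conf):
    """Mean IoU and mean accuracy from a confusion matrix.

    conf[i][j] = number of pixels with ground-truth class i
    that were predicted as class j.
    """
    n = len(conf)
    ious, accs = [], []
    for c in range(n):
        tp = conf[c][c]
        gt = sum(conf[c])                         # row sum: ground-truth pixels of class c
        pred = sum(conf[r][c] for r in range(n))  # column sum: pixels predicted as class c
        union = gt + pred - tp
        ious.append(tp / union if union else 0.0)
        accs.append(tp / gt if gt else 0.0)
    return sum(ious) / n, sum(accs) / n

# Toy 2-class example: class 0 is segmented perfectly,
# half of class 1's pixels are confused with class 0.
conf = [[50, 0],
        [25, 25]]
miou, macc = miou_macc(conf)  # mIoU ~ 0.583, mAcc = 0.75
```

Here class 0 gets IoU 50/75 (its union includes the 25 misclassified pixels of class 1) and class 1 gets IoU 25/50, which is why mIoU is stricter than mAcc on the same predictions.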
Researcher Affiliation | Academia | Nan Zhou1,2, Huiqun Wang1,2, Yaoyan Zheng1,2, Di Huang1,2. 1State Key Laboratory of Complex and Critical Software Environment, Beihang University, Beijing, China; 2School of Computer Science and Engineering, Beihang University, Beijing, China. EMAIL
Pseudocode | No | The paper describes methods using mathematical formulas and conceptual descriptions in sections such as '3.1 VANILLA PETL', '3.2 MIDSTREAM PROGRESSIVE ADAPTATION', and '3.3 INTERMEDIATE TASK DESIGNATION'. However, it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps formatted like code.
Open Source Code | Yes | Code is available at: https://github.com/weeknan/ProPETL.
Open Datasets | Yes | Datasets. We conduct a comprehensive evaluation of the proposed ProPETL for segmentation adaptation on benchmarks, including PASCAL VOC2012 (Everingham et al., 2015), ADE20k (Zhou et al., 2019), COCO-Stuff10k (Caesar et al., 2018), and Cityscapes (Cordts et al., 2016).
Dataset Splits | Yes | 1) PASCAL VOC2012 consists of 1,464 training images and 1,449 validation images spanning 21 categories. 2) ADE20k includes 20,210 training images and 2,000 validation images, making it one of the most comprehensive datasets for semantic segmentation with a wide range of data variances. 3) COCO-Stuff10k extends the COCO dataset (Lin et al., 2014) with dense annotations for semantic segmentation, including 9,000 training images and 1,000 validation images across 172 categories. 4) Cityscapes includes finely annotated images across 19 categories, with 2,975 for training and 500 for validation.
Hardware Specification | Yes | All experiments are implemented using MMSegmentation (Contributors, 2020) on NVIDIA A800 GPU.
Software Dependencies | No | The paper mentions 'MMSegmentation (Contributors, 2020)' and 'AdamW (Loshchilov & Hutter, 2018)' for optimization, but it does not specify any version numbers for MMSegmentation or other libraries and programming languages used.
Experiment Setup | Yes | Implementation details. We employ AdamW (Loshchilov & Hutter, 2018) for optimization, with β1 and β2 set to 0.9 and 0.999, respectively. A poly LR scheduler is used to dynamically adjust the learning rate during the training process. The batch size is set to 16 across all benchmarks, and the iterations for downstream fine-tuning range from 40k to 160k, consistent with those in previous works (Xiao et al., 2018; Liu et al., 2021) for a fair comparison. For midstream adaptation, the iterations are set to half of the corresponding downstream settings.
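The poly LR schedule mentioned above follows a standard polynomial decay, lr(t) = (base_lr - min_lr) * (1 - t / T)^power + min_lr. The sketch below shows this formula for a 40k-iteration downstream run; the base learning rate and the power (0.9 is MMSegmentation's common default) are illustrative assumptions, as the paper excerpt does not state them.

```python
def poly_lr(base_lr, cur_iter, max_iter, power=0.9, min_lr=0.0):
    """Polynomial learning-rate decay (poly LR schedule).

    Decays from base_lr at iteration 0 to min_lr at max_iter.
    power=0.9 is a common default in MMSegmentation configs,
    assumed here for illustration.
    """
    coeff = (1 - cur_iter / max_iter) ** power
    return (base_lr - min_lr) * coeff + min_lr

# Illustrative 40k-iteration downstream schedule (base_lr is an assumption).
base_lr, max_iter = 6e-5, 40_000
schedule = [poly_lr(base_lr, it, max_iter) for it in (0, 20_000, 40_000)]
```

Under this setup, the midstream adaptation stage would simply reuse the same schedule with max_iter halved (20k for a 40k downstream run), per the paper's "half of the corresponding downstream settings".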