Squeezing Context into Patches: Towards Memory-Efficient Ultra-High Resolution Semantic Segmentation
Authors: Wang Liu, Puhong Duan, Xudong Kang, Shutao Li
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the effectiveness of our proposed method on four widely used UHR segmentation benchmarks. Experimental results demonstrate that our approach enhances UHR segmentation accuracy without incurring additional memory overhead during the inference stage. |
| Researcher Affiliation | Academia | Wang Liu¹, Puhong Duan², Xudong Kang² and Shutao Li¹,². ¹College of Electrical and Information Engineering, Hunan University, China; ²School of Robotics, Hunan University, China. Email addresses are redacted in the source. |
| Pseudocode | No | The paper describes the methodology using natural language, mathematical equations, and figures illustrating the architecture and concepts (e.g., Figure 2, Figure 3), but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/StuLiu/SCPSeg. |
| Open Datasets | Yes | We conduct experiments on four UHRSS datasets to comprehensively evaluate the effectiveness of our proposed method. ISPRS Potsdam. This dataset is a land cover mapping dataset collected in an urban area. BLU. This dataset is collected in urban and rural areas. DeepGlobe. This is a land cover mapping dataset collected in both urban and rural areas. Inria Aerial. Inria Aerial is a building extraction dataset collected in urban areas. |
| Dataset Splits | Yes | ISPRS Potsdam. The train, val, and test sets are split into 18, 6, and 14 tiles following [Zhang et al., 2024]. BLU. It is split into 192, 28, and 32 tiles for training, validation, and testing, respectively. DeepGlobe. We split it into 455, 142, and 206 subsets for training, validation, and testing following previous work [Chen et al., 2019]. Inria Aerial. We split it into 126, 27, and 27 subsets for training, validation, and testing following previous work [Chen et al., 2019]. |
| Hardware Specification | Yes | All experiments are conducted on a single Nvidia RTX 4090 GPU. |
| Software Dependencies | No | The paper mentions using Deeplabv3+ with a ResNet-18-d8 backbone as the basic segmenter and SGD for optimization, and pre-training on ImageNet-1K, but it does not specify version numbers for general software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | The learning rate is initially set to 0.01 and decayed by a cosine learning rate policy after each iteration. The number of training iterations is set to 40000. The sliding window size k in LFA is set to 7. For the ISPRS Potsdam, BLU, and Inria Aerial datasets, we set G = 512, L = 256, and D = 192, with a training batch size of 16. For the DeepGlobe dataset, we set G = 1024, L = 512, and D = 384, with a training batch size of 8. |
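The per-iteration cosine learning-rate policy from the Experiment Setup row can be sketched as below. The function name and the optional `min_lr` floor are illustrative assumptions; the base rate 0.01 and the 40000-iteration horizon are taken from the table.

```python
import math

def cosine_lr(base_lr: float, it: int, max_it: int, min_lr: float = 0.0) -> float:
    """Cosine decay of the learning rate, applied after each iteration.

    At it = 0 this returns base_lr; at it = max_it it reaches min_lr
    (an assumed floor, not stated in the paper; 0.0 by default).
    """
    t = min(it, max_it) / max_it  # progress through training in [0, 1]
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * t))

# With the reported settings: starts at 0.01, reaches 0.005 at the
# halfway point (20000 iterations), and decays to 0 at 40000.
```

In practice this would typically be handled by a framework scheduler (e.g. PyTorch's `CosineAnnealingLR` with `T_max` set to the iteration count), but the paper does not state which implementation was used.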