MaskTwins: Dual-form Complementary Masking for Domain-Adaptive Image Segmentation

Authors: Jiawen Wang, Yinda Chen, Xiaoyu Liu, Che Liu, Dong Liu, Jianqing Gao, Zhiwei Xiong

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments verify the superiority of MaskTwins over baseline methods in natural and biological image segmentation. These results demonstrate the significant advantages of MaskTwins in extracting domain-invariant features without the need for separate pre-training, offering a new paradigm for domain-adaptive segmentation.
Researcher Affiliation Collaboration 1University of Science and Technology of China 2Institute of Artificial Intelligence, Hefei Comprehensive National Science Center 3Data Science Institute, Imperial College London 4iFLYTEK CO., LTD. Correspondence to: Zhiwei Xiong <EMAIL>, Jianqing Gao <EMAIL>.
Pseudocode Yes We provide the overall training procedure of MaskTwins for image segmentation in Algorithm 1.
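The dual-form complementary masking named in the title can be illustrated with a short sketch. This is not the authors' implementation: the function name, patch size, and masking ratio are illustrative assumptions; it only shows the core idea of splitting an image into two masked views whose masks are exact complements, so the two views jointly cover every pixel once.

```python
import numpy as np

def complementary_views(image, patch_size=64, mask_ratio=0.5, rng=None):
    """Create two complementary masked views of an image (illustrative sketch).

    A coarse patch-level binary mask M is sampled; one view keeps the
    patches where M == 1 and the other keeps the complement (1 - M),
    so view1 + view2 reconstructs the original image exactly.
    Assumes image height and width are divisible by patch_size.
    """
    rng = np.random.default_rng(rng)
    h, w = image.shape[:2]
    gh, gw = h // patch_size, w // patch_size
    # Sample the mask at patch granularity, then expand to pixel level.
    coarse = (rng.random((gh, gw)) < mask_ratio).astype(image.dtype)
    mask = np.kron(coarse, np.ones((patch_size, patch_size), dtype=image.dtype))
    if image.ndim == 3:  # broadcast the mask over the channel axis
        mask = mask[..., None]
    return image * mask, image * (1.0 - mask)
```

A consistency loss between the segmentation predictions on the two views (as in Algorithm 1 of the paper) would then encourage mask-invariant, and hence more domain-invariant, features.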
Open Source Code Yes The source code is available at https://github.com/jwwang0421/masktwins.
Open Datasets Yes To demonstrate the versatility of MaskTwins, we conduct experiments spanning six distinct datasets: SYNTHIA (Ros et al., 2016) and Cityscapes (Cordts et al., 2016) are natural datasets, VNC III (Gerhard et al., 2013), Lucchi (Lucchi et al., 2013), MitoEM (Wei et al., 2020) and WASPSYN (Li et al., 2024) are biological datasets.
Dataset Splits Yes Cityscapes consists of 2,975 training and 500 validation real-world images. The training subset (Subset1) and the test subset (Subset2) of Lucchi each contain 165 images, with a resolution of 1024 × 768 pixels.
Hardware Specification Yes The experiments are conducted on 8 RTX 3090 GPUs.
Software Dependencies No The paper mentions using AdamW and Adam optimizers, and refers to specific data augmentation techniques and architectures (e.g., MiT-B5 encoder, U-Net, ResUNet), but does not provide specific version numbers for any software libraries or programming languages used (e.g., PyTorch version, Python version, CUDA version).
Experiment Setup Yes For SYNTHIA → Cityscapes, we use a patch size b = 64, a loss weight λcm = 0.01...AdamW (Loshchilov, 2017) with a learning rate of 6 × 10⁻⁵ for the encoder and 6 × 10⁻⁴ for the decoder, 40k training iterations, a batch size of 2, linear learning rate warmup, a loss weight λst = 1, an EMA factor α = 0.999...
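The EMA factor α = 0.999 quoted above is the kind of factor typically used to maintain a teacher model as an exponential moving average of the student's weights in self-training pipelines. A minimal sketch of that update, with illustrative parameter lists standing in for real model parameters (this is a generic EMA rule, not the paper's exact code):

```python
def ema_update(teacher_params, student_params, alpha=0.999):
    """Return EMA-updated teacher parameters: θ_t ← α·θ_t + (1 − α)·θ_s.

    teacher_params / student_params are flat lists of scalars here for
    illustration; in a real pipeline they would be model weight tensors.
    """
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_params, student_params)]
```

With α = 0.999 the teacher changes by only 0.1% of the student's deviation per iteration, which smooths out noisy pseudo-label targets over the 40k training iterations reported in the setup.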