HSRDiff: A Hierarchical Self-Regulation Diffusion Model for Stochastic Semantic Segmentation

Authors: Han Yang, Chuanguang Yang, Zhulin An, Libo Huang, Yongjun Xu

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate HSRDiff in three different semantic scenarios. Experimental results show that HSRDiff is superior to the comparison method with a considerable performance gap.
Researcher Affiliation | Academia | Han Yang (1,2), Chuanguang Yang (1)*, Zhulin An (1)*, Libo Huang (1), Yongjun Xu (1). (1) Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; (2) University of Chinese Academy of Sciences, Beijing, China.
Pseudocode | Yes | Algorithm 1: HSRDiff Training Procedure; Algorithm 2: HSRDiff Inference Procedure
Open Source Code | Yes | Code: https://github.com/yanghan-yh/HSRDiff.git
Open Datasets | Yes | Lung Nodule Segmentation (LIDC-IDRI): The LIDC-IDRI dataset is a typical abnormality dataset... Multiple Sclerosis Lesion Segmentation (MS-Lesion): The dataset includes 84 longitudinal MRI scans from 5 subjects (Carass et al. 2017)... Multimodal Semantic Segmentation (Cityscapes): Cityscapes is a multi-class semantic segmentation dataset.
Dataset Splits | Yes | Lung Nodule Segmentation (LIDC-IDRI): ...Finally, the training set consists of 1620 images, and the test set consists of 406 images. Multiple Sclerosis Lesion Segmentation (MS-Lesion): ...The training set contains 2300 slices, and the test set includes 531 slices. Multimodal Semantic Segmentation (Cityscapes): ...The official training dataset consists of 2975 images, and the validation dataset contains 500 images.
Hardware Specification | No | The paper does not explicitly describe the hardware used for experiments. It mentions implementation details but no specific GPU/CPU models or other hardware specifications.
Software Dependencies | No | HSRDiff is implemented using PyTorch. The paper mentions PyTorch but does not specify a version number or list other software dependencies with their versions.
Experiment Setup | Yes | For LIDC-IDRI, we crop images to 128 × 128 resolution and train for 500 epochs with a batch size of 40. The MS-Lesion dataset uses the same settings as LIDC-IDRI, but we resize the slices to 128 × 128. For Cityscapes, our two experimental groups followed the settings of (Kohl et al. 2018) and (Zbinden et al. 2023) respectively, training for 800 epochs with a batch size of 4. Across all experiments, we use 250 time steps with a linear noise schedule and the AdamW optimizer with a learning rate of 10^-4. Both λ1 and λ2 are set to 1.
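
The values in the Experiment Setup row can be gathered into a small configuration sketch. This is only an illustration and not the authors' code: the class, field, and function names are hypothetical, and the schedule endpoints (1e-4 and 0.02) are assumed DDPM-style defaults that the summary does not state. Only the epochs, batch sizes, image sizes, 250 time steps, learning rate, and λ values come from the row above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ExperimentConfig:
    """Per-dataset settings quoted in the Experiment Setup row (names are hypothetical)."""
    epochs: int
    batch_size: int
    image_size: Optional[int]      # None where the row gives no resolution
    timesteps: int = 250           # "250 time steps"
    learning_rate: float = 1e-4    # AdamW learning rate
    lambda1: float = 1.0
    lambda2: float = 1.0


CONFIGS = {
    "LIDC-IDRI": ExperimentConfig(epochs=500, batch_size=40, image_size=128),
    "MS-Lesion": ExperimentConfig(epochs=500, batch_size=40, image_size=128),
    # Cityscapes resolution follows the cited prior work and is not stated here.
    "Cityscapes": ExperimentConfig(epochs=800, batch_size=4, image_size=None),
}


def linear_beta_schedule(timesteps: int,
                         beta_start: float = 1e-4,   # assumed, not from the source
                         beta_end: float = 0.02      # assumed, not from the source
                         ) -> List[float]:
    """Return noise variances spaced evenly from beta_start to beta_end."""
    step = (beta_end - beta_start) / (timesteps - 1)
    return [beta_start + i * step for i in range(timesteps)]


def cumulative_alpha(betas: List[float]) -> List[float]:
    """Return the running product of (1 - beta_t), i.e. the retained signal fraction."""
    products, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        products.append(running)
    return products


BETAS = linear_beta_schedule(CONFIGS["LIDC-IDRI"].timesteps)
ALPHA_BAR = cumulative_alpha(BETAS)
```

In a PyTorch training script, `learning_rate` would typically be passed to `torch.optim.AdamW(model.parameters(), lr=...)`; the block above stays in the standard library so it runs without extra dependencies.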