Free Lunch of Image-mask Alignment for Anomaly Image Generation and Segmentation

Authors: Xiangyue Li, Xiaoyang Wang, Zhibin Wan, Quan Zhang, Yupei Wu, Tao Deng, Mingjie Sun

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show our method's IoU metrics exceed previous methods by 5.03%, 5.68%, and 16.63% on the Real-IAD (industrial), polyp (medical), and Floor Dirty (indoor) datasets.
Researcher Affiliation | Collaboration | (1) School of Computer Science & Technology, Soochow University; (2) Xi'an Jiaotong-Liverpool University; (3) Aqrose Technology
Pseudocode | No | The paper describes the methods through textual explanations and figures, but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The code is publicly accessible at https://github.com/huanyin/anomaly-alignment.
Open Datasets | Yes | The evaluation experiments are conducted across multiple scenarios: medical polyp datasets (ETIS [Silva et al., 2014], CVC-ClinicDB/CVC-612 [Bernal, 2015], CVC-ColonDB [Tajbakhsh et al., 2015], EndoScene [Vázquez et al., 2017], Kvasir [Jha et al., 2020]), industrial datasets (Real-IAD [Wang et al., 2024], MVTec-AD [Bergmann et al., 2019]), and the Floor Dirty dataset.
Dataset Splits | Yes | Quoted evidence includes a results-table row: "train-real (1450): 93.7 88.9 78.7 70.6 90.0 83.3 91.7 86.4 80.8 72.7 86.98 80.38 ...". The Real-IAD dataset, split into easy (10 objects) and hard (20 objects) subsets based on generation difficulty, shows that SegFormer trained with the synthetic samples achieves 1.07% and 5.03% higher average mIoU on MVTec-AD and Real-IAD, respectively, than SegFormer trained on samples generated by AnomalyDiffusion.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for running the experiments.
Software Dependencies | No | LoRA [Hu et al., 2022] is used to add image conditions to Stable Diffusion, which serves as the baseline generative model; SegFormer [Xie et al., 2021] is the baseline segmentation model; AdamW is the optimizer. No library versions or dependency lists are provided.
Experiment Setup | Yes | AdamW is adopted as the optimizer. Input and output images are constrained to 512×512. The learning rate is set to 1e-5 and the batch size to 4. For inference of the diffusion model, the classifier-free guidance scale is set to 7, and the weighting factor α for the alignment loss L_al is set to 0.7.
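The reported hyperparameters can be collected into a minimal sketch. All names (ExperimentConfig, alpha_al, total_loss) and the additive loss-combination rule are assumptions for illustration; the paper only states the values themselves:

```python
# Hypothetical configuration mirroring the hyperparameters reported in the paper.
# Names and the loss-combination rule are illustrative, not from the code release.
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    image_size: int = 512        # input/output resolution (512x512)
    learning_rate: float = 1e-5  # AdamW learning rate
    batch_size: int = 4
    cfg_scale: float = 7.0       # classifier-free guidance scale at inference
    alpha_al: float = 0.7        # weight alpha for the alignment loss L_al


cfg = ExperimentConfig()


def total_loss(diffusion_loss: float, alignment_loss: float,
               alpha: float = cfg.alpha_al) -> float:
    """Combine the base diffusion loss with the alignment loss as
    L = L_diff + alpha * L_al. The additive form is an assumption;
    the paper only specifies alpha = 0.7."""
    return diffusion_loss + alpha * alignment_loss
```

With these values, a diffusion loss of 1.0 and an alignment loss of 2.0 would combine to 1.0 + 0.7 * 2.0 = 2.4 under this assumed rule.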