Free Lunch of Image-mask Alignment for Anomaly Image Generation and Segmentation
Authors: Xiangyue Li, Xiaoyang Wang, Zhibin Wan, Quan Zhang, Yupei Wu, Tao Deng, Mingjie Sun
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show our method's IoU metrics exceed previous methods by 5.03%, 5.68% and 16.63% on the Real-IAD (industrial), polyp (medical) and Floor Dirty (indoor) datasets. |
| Researcher Affiliation | Collaboration | 1. School of Computer Science & Technology, Soochow University; 2. Xi'an Jiaotong-Liverpool University; 3. Aqrose Technology |
| Pseudocode | No | The paper describes the methods through textual explanations and figures, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is publicly accessible at https://github.com/huanyin/anomaly-alignment. |
| Open Datasets | Yes | The evaluation experiments are conducted on multiple scenarios, including medical polyp datasets (ETIS [Silva et al., 2014], CVC-ClinicDB/CVC-612 [Bernal, 2015], CVC-ColonDB [Tajbakhsh et al., 2015], EndoScene [Vázquez et al., 2017], Kvasir [Jha et al., 2020]), industrial datasets (Real-IAD [Wang et al., 2024], MVTec-AD [Bergmann et al., 2019]), and the Floor Dirty dataset. |
| Dataset Splits | Yes | A train-real split of 1,450 samples is reported. The Real-IAD dataset, split into easy (10 objects) and hard (20 objects) subsets based on generation difficulty, reveals that Segformer trained with our synthetic samples achieves 1.07% and 5.03% higher average mIoU on MVTec-AD and Real-IAD, respectively, than Segformer trained on the samples generated by AnomalyDiffusion. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for running the experiments. |
| Software Dependencies | No | Using LoRA [Hu et al., 2022] to add image conditions for Stable Diffusion is adopted as the baseline generative model. Segformer [Xie et al., 2021] is adopted as the baseline segmentation model. AdamW is adopted as the optimizer. |
| Experiment Setup | Yes | AdamW is adopted as the optimizer. Input and output images are constrained to 512×512. The learning rate is set as 1e-5. The batch size is set as 4. For the inference of the diffusion model, the classifier-free guidance scale is set as 7. We set the factor α for L_al to 0.7. |
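The hyperparameters reported in the Experiment Setup row can be collected into a single configuration sketch. This is a hypothetical illustration, not the authors' code: the variable names, the dictionary layout, and the assumption that the alignment loss L_al enters the objective as a weighted sum with factor α are all ours; the paper only states the individual values.

```python
# Hypothetical reconstruction of the reported training/inference settings.
# All names are illustrative; only the numeric values come from the paper.
config = {
    "optimizer": "AdamW",
    "image_size": (512, 512),          # input/output resolution
    "learning_rate": 1e-5,
    "batch_size": 4,
    "guidance_scale": 7.0,             # classifier-free guidance at inference
    "alpha_align": 0.7,                # factor alpha for the alignment loss L_al
}

def total_loss(base_loss: float, align_loss: float,
               alpha: float = config["alpha_align"]) -> float:
    """Combine the base generative loss with the alignment loss L_al.

    Assumes a simple weighted sum, L = L_base + alpha * L_al; the paper
    specifies alpha = 0.7 but not the exact combination form.
    """
    return base_loss + alpha * align_loss
```

With these values, a base loss of 1.0 and an alignment loss of 1.0 would combine to 1.7 under the assumed weighted-sum form.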