Towards VLM-based Hybrid Explainable Prompt Enhancement for Zero-Shot Industrial Anomaly Detection

Authors: Weichao Cai, Weiliang Huang, Yunkang Cao, Chao Huang, Fei Yuan, Bob Zhang, Jie Wen

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on seven real-world industrial anomaly detection datasets have shown that the proposed method not only outperforms recent SOTA methods, but also its explainable prompts provide the model with a more intuitive basis for anomaly identification.
Researcher Affiliation | Academia | Weichao Cai (1), Weiliang Huang (2), Yunkang Cao (3), Chao Huang (4), Fei Yuan (1), Bob Zhang (2), Jie Wen (5). (1) School of Information, Xiamen University; (2) Department of Computer and Information Science, University of Macau; (3) School of Robotics, Hunan University; (4) School of Cyber Science and Technology, Shenzhen Campus of Sun Yat-sen University; (5) School of Computer Science & Technology, Harbin Institute of Technology, Shenzhen.
Pseudocode | No | The paper describes the methodology using figures and textual descriptions (e.g., Figures 1, 2, and 3 provide overviews and detailed components), but it does not contain explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain an unambiguous statement of code release or a link to a source-code repository for the methodology described.
Open Datasets | Yes | We conduct experiments using seven industrial anomaly detection datasets for all experiments: MVTec AD [Bergmann et al., 2021], VisA [Zou et al., 2022], MPDD [Jezek et al., 2021], BTAD [Mishra et al., 2021], KSDD [Tabernik et al., 2020], DAGM [Wieler and Hahn, 2007], and DTD-Synthetic [Aota et al., 2023].
Dataset Splits | No | The paper lists several datasets used for experiments, but it does not provide specific details on how these datasets were split into training, validation, or test sets (e.g., percentages, sample counts, or references to predefined splits).
Hardware Specification | Yes | All experiments were performed with a single NVIDIA A100 GPU (80GB).
Software Dependencies | No | We adopted QWen2-VL-72B [Wang et al., 2024b] to generate detailed descriptions of the anomalies. Furthermore, QWen2.5-7B [Yang et al., 2024] is utilized to extract anomalous information and judge the presence of anomalies. The pre-trained CLIP (ViT-L/14@336px) [Radford et al., 2021] is employed as the backbone for subsequent ZSIAD models, extracting patch embeddings from the 6th, 12th, 18th, and 24th ViT blocks. DINOv2 (ViT-S) [Oquab et al., 2024] is adopted as the VFM. While specific models/architectures are named with their respective publications, the paper does not list specific software libraries (e.g., PyTorch, TensorFlow) with version numbers.
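The multi-level feature extraction described here (patch embeddings tapped from the 6th, 12th, 18th, and 24th ViT blocks) is typically done with forward hooks. The sketch below illustrates that hook pattern on a small stand-in stack of transformer blocks, not the actual CLIP ViT-L/14@336px backbone; the sizes and block modules are placeholder assumptions.

```python
# Sketch: capturing intermediate patch embeddings from selected transformer
# blocks via forward hooks. A toy stack stands in for CLIP ViT-L/14@336px;
# DEPTH/DIM/N_PATCHES are illustrative, not the real model's sizes.
import torch
import torch.nn as nn

DEPTH, DIM, N_PATCHES = 24, 64, 16      # toy sizes (ViT-L does have 24 blocks)
TAP_BLOCKS = [6, 12, 18, 24]            # 1-indexed blocks tapped, as reported

blocks = nn.ModuleList(
    nn.TransformerEncoderLayer(d_model=DIM, nhead=4, batch_first=True)
    for _ in range(DEPTH)
)

features = {}

def make_hook(idx):
    def hook(module, inputs, output):
        features[idx] = output.detach()  # patch embeddings after block idx
    return hook

for idx in TAP_BLOCKS:
    blocks[idx - 1].register_forward_hook(make_hook(idx))

x = torch.randn(1, N_PATCHES, DIM)      # stand-in for the patch-token sequence
with torch.no_grad():
    for blk in blocks:
        x = blk(x)

print(sorted(features), tuple(features[6].shape))
```

With the real backbone, the same hooks would be registered on the chosen residual blocks of the CLIP visual transformer instead of this toy stack.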
Experiment Setup | Yes | We trained the proposed method for 5 epochs with a learning rate of 0.01.
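The only stated hyperparameters are 5 epochs and a learning rate of 0.01. A minimal sketch of such a setup follows; the model, optimizer choice, loss, and data below are placeholders, since the report does not say which optimizer or objective the paper used.

```python
# Hedged sketch of the reported setup (5 epochs, lr = 0.01). Everything
# except those two numbers is a placeholder assumption.
import torch
import torch.nn as nn

model = nn.Linear(8, 2)                                # stand-in trainable module
opt = torch.optim.SGD(model.parameters(), lr=0.01)     # optimizer choice assumed
loss_fn = nn.CrossEntropyLoss()                        # objective assumed

x = torch.randn(32, 8)                                 # dummy batch
y = torch.randint(0, 2, (32,))

losses = []
for epoch in range(5):                                 # 5 epochs, as reported
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
    losses.append(loss.item())
    print(f"epoch {epoch + 1}: loss {loss.item():.4f}")
```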