Kernel-Aware Graph Prompt Learning for Few-Shot Anomaly Detection

Authors: Fenfang Tao, Guo-Sen Xie, Fang Zhao, Xiangbo Shu

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments on the MVTec AD and VisA datasets show that KAG-prompt achieves state-of-the-art FSAD results for image-level/pixel-level anomaly detection. We further validate the effectiveness of KAG-prompt through comprehensive ablation studies. Experiment Settings Datasets. We mainly conduct experiments on the MVTec AD (Bergmann et al. 2019) and VisA (Zou et al. 2022) datasets.
Researcher Affiliation Academia 1School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China 2School of Intelligence Science and Technology, Nanjing University, Suzhou, China
Pseudocode No The paper describes the methodology using textual explanations and mathematical equations, but it does not contain a clearly labeled pseudocode or algorithm block.
Open Source Code Yes Code https://github.com/CVL-hub/KAG-prompt.git
Open Datasets Yes We mainly conduct experiments on the MVTec AD (Bergmann et al. 2019) and VisA (Zou et al. 2022) datasets.
Dataset Splits Yes The MVTec AD dataset contains 5,354 high-resolution images of 5 textures and 10 objects. The training set contains 3,629 sample images without anomalies. The test set contains 1,725 images including normal and anomalous samples. The VisA dataset has 12 subsets containing 10,821 high-resolution images, of which 9,621 are normal images and 1,200 are anomalous images. As with AnomalyGPT (Gu et al. 2024b), we use the training set of one dataset as well as the synthesized anomalous images for training and perform few-shot testing on the other dataset.
Hardware Specification Yes Two RTX-3090 GPUs are used for acceleration during training.
Software Dependencies No The paper mentions using 'ImageBind-Huge' as an image encoder but does not provide specific version numbers for any software dependencies such as programming languages, libraries, or frameworks.
Experiment Setup Yes During training, the learning rate is set to 1e-3, the batch size to 16, and the number of iterations T for graph prompts to 5. The model is trained for 50 epochs on the MVTec AD dataset and 80 epochs on the VisA dataset. We set the fusion coefficient γ to 0.1 and select the top-30 scores using a top-k strategy. We vary k from 1 to 80 and obtain the best result 91.62% when k = 40, after which the performance decreases as k increases. As such, we set k = 30 in all experiments. We vary γ from 0 to 0.5, as shown in Fig. 6. The results are best at γ = 0.1. Therefore, we set γ = 0.1 for all experiments.
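The top-k selection and γ-weighted fusion named in the setup above could be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `fused_anomaly_score` and the exact fusion rule (a convex-style combination of a global image score with the mean of the k highest pixel scores) are assumptions; the paper only specifies the hyperparameter values k = 30 and γ = 0.1.

```python
import numpy as np

def fused_anomaly_score(pixel_scores, image_score, k=30, gamma=0.1):
    """Hypothetical sketch of top-k scoring with fusion coefficient gamma.

    pixel_scores: 1-D array of per-pixel anomaly scores (flattened anomaly map).
    image_score:  scalar image-level anomaly score.
    k, gamma:     hyperparameters reported in the experiment setup.
    """
    k = min(k, pixel_scores.size)
    # Aggregate the k highest pixel scores as the pixel-level evidence.
    topk_mean = np.sort(pixel_scores)[-k:].mean()
    # Assumed fusion rule: gamma weights the image-level score.
    return gamma * image_score + (1.0 - gamma) * topk_mean
```

With γ = 0.1 the final score is dominated by the top-k pixel evidence, which matches the reported finding that small fusion coefficients work best.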