Boosting Fine-Grained Visual Anomaly Detection with Coarse-Knowledge-Aware Adversarial Learning

Authors: Qingqing Fang, Qinliang Su, Wenxi Lv, Wenchao Xu, Jianxing Yu

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on four medical datasets and two industrial datasets demonstrate the effectiveness of our method in improving the detection and localization performance.
Researcher Affiliation | Academia | 1 School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China 2 Guangdong Key Laboratory of Big Data Analysis and Processing, Guangzhou, China 3 Department of Computing, The Hong Kong Polytechnic University, Hong Kong SAR 4 School of Artificial Intelligence, Sun Yat-sen University, Guangdong, China 5 Pazhou Lab, Guangzhou, 510330, China EMAIL, EMAIL, EMAIL
Pseudocode | No | The paper describes the proposed method through equations and textual explanations, including theorems and proofs, but does not contain any structured pseudocode or algorithm blocks. For example, "Proof. Please see the proof in the extended version." refers to a theoretical proof, not an algorithm.
Open Source Code | Yes | Code: https://github.com/Faustinaqq/CKAAD
Open Datasets | Yes | Medical datasets: i) ISIC2018 (Tschandl, Rosendahl, and Kittler 2018; Codella et al. 2019): The ISIC2018 challenge dataset (task 3) contains 7 categories; we treat NV (nevus) as normal and the remaining 6 categories as abnormal. ii) Chest X-rays (Kermany et al. 2018) contains normal and abnormal chest X-ray scans. iii) Br35H (Hamada 2020; Zhou et al. 2024): The Brain Tumor Detection dataset contains non-tumorous and various tumorous images. iv) OCT (Kermany et al. 2018): Retinal optical coherence tomography (OCT) contains normal OCT scans and three types of scans with diseases. Industrial datasets: i) MVTec AD (Bergmann et al. 2019) is a widely known industrial dataset comprising 15 classes with 5 textures and 10 objects. ii) VisA (Zou et al. 2022) is an industrial dataset containing 12 classes.
Dataset Splits | No | The paper discusses the ratio of collected anomalies (rl) in the training data and the number of anomaly types (k) used for training, for example, "Using only 1% anomalies, our method improves the AUC...". However, it does not provide specific train/validation/test splits (e.g., percentages or exact counts) for the datasets themselves.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. It only mentions the model architectures used (ResNet18, Wide ResNet50).
Software Dependencies | No | The paper mentions using the Adam optimizer and setting specific hyperparameters (e.g., learning rates), but it does not name any software or libraries with version numbers (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | All images are resized to 256. α and γ are set to 0.5. λ is set to 0.02. The Adam optimizer is used with β = (0.5, 0.999). The learning rate for the auto-encoder is set to 1e-03 for medical datasets and 5e-03 for industrial datasets; the learning rate for the discriminator is set to 1e-04. ResNet18 is chosen as the backbone with S = {2, 3} for medical datasets. Wide ResNet50 and S = {1, 2, 3} are used for industrial datasets because the anomalies are more subtle.
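
The settings in the Experiment Setup row can be collected into a small configuration object. The following is a minimal, framework-agnostic sketch in Python, not the authors' code: the class name `ExperimentConfig`, the function `make_config`, the field names, and the backbone identifier strings are all hypothetical, and only the numeric values and the medical/industrial distinction come from the row above. Because the paper does not name its software stack, no deep learning library is assumed.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ExperimentConfig:
    """Hyperparameters reported in the Experiment Setup row (field names are hypothetical)."""
    backbone: str                      # hypothetical identifier for the backbone network
    feature_stages: Tuple[int, ...]    # the set S from the setup description
    lr_autoencoder: float              # depends on the dataset domain
    lr_discriminator: float = 1e-4     # same for both domains
    image_size: int = 256              # "All images are resized to 256"
    alpha: float = 0.5
    gamma: float = 0.5
    lam: float = 0.02                  # lambda in the setup description
    adam_betas: Tuple[float, float] = (0.5, 0.999)


def make_config(domain: str) -> ExperimentConfig:
    """Return the reported settings for 'medical' or 'industrial' datasets."""
    if domain == "medical":
        # ResNet18 with S = {2, 3}, auto-encoder learning rate 1e-3
        return ExperimentConfig("resnet18", (2, 3), 1e-3)
    if domain == "industrial":
        # Wide ResNet50 with S = {1, 2, 3}, auto-encoder learning rate 5e-3
        return ExperimentConfig("wide_resnet50", (1, 2, 3), 5e-3)
    raise ValueError(f"unknown domain: {domain!r}")
```

Calling `make_config("medical")` or `make_config("industrial")` yields the two reported settings; the values shared by both domains (image size, α, γ, λ, the Adam β pair, and the discriminator learning rate) are kept as field defaults so that only the domain-specific choices appear at the call site.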