Boosting Fine-Grained Visual Anomaly Detection with Coarse-Knowledge-Aware Adversarial Learning
Authors: Qingqing Fang, Qinliang Su, Wenxi Lv, Wenchao Xu, Jianxing Yu
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on four medical datasets and two industrial datasets demonstrate the effectiveness of our method in improving the detection and localization performance. |
| Researcher Affiliation | Academia | 1 School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China; 2 Guangdong Key Laboratory of Big Data Analysis and Processing, Guangzhou, China; 3 Department of Computing, The Hong Kong Polytechnic University, Hong Kong SAR; 4 School of Artificial Intelligence, Sun Yat-sen University, Guangdong, China; 5 Pazhou Lab, Guangzhou, 510330, China |
| Pseudocode | No | The paper describes the proposed method through equations and textual explanations, including theorems and proofs, but does not contain any structured pseudocode or algorithm blocks. For example, "Proof. Please see the proof in the extended version." refers to theoretical proof, not an algorithm. |
| Open Source Code | Yes | Code: https://github.com/Faustinaqq/CKAAD |
| Open Datasets | Yes | Medical datasets: i) ISIC2018 (Tschandl, Rosendahl, and Kittler 2018; Codella et al. 2019): The ISIC2018 challenge dataset (task 3) contains 7 categories; we classify NV (nevus) as normal and the remaining 6 categories as abnormal. ii) Chest X-rays (Kermany et al. 2018) contains normal and abnormal chest X-ray scans. iii) Br35H (Hamada 2020; Zhou et al. 2024): The Brain Tumor Detection dataset contains non-tumorous and various tumorous images. iv) OCT (Kermany et al. 2018): The retinal optical coherence tomography (OCT) dataset contains normal OCT scans and three types of scans with diseases. Industrial datasets: i) MVTec AD (Bergmann et al. 2019) is a widely known industrial dataset comprising 15 classes with 5 textures and 10 objects. ii) VisA (Zou et al. 2022) is an industrial dataset containing 12 classes. |
| Dataset Splits | No | The paper discusses the ratio of collected anomalies (rl) in the training data and the number of anomaly types (k) used for training, for example, "Using only 1% anomalies, our method improves the AUC...". However, it does not provide specific train/test/validation splits (e.g., percentages or exact counts) for the overall datasets themselves. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. It only mentions the model architectures used (ResNet18, Wide ResNet50). |
| Software Dependencies | No | The paper mentions using the Adam optimizer and setting specific hyperparameters (e.g., learning rates), but it does not specify any software or library names with their version numbers (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | All images are resized to 256. α, γ are set to 0.5. λ is set to 0.02. Adam optimizer is utilized with β = (0.5, 0.999). The learning rate for the auto-encoder is set to 1e-03 for medical datasets, 5e-03 for industrial datasets, and 1e-04 for the discriminator. ResNet18 is chosen as the backbone and S = {2, 3} for medical datasets. Wide ResNet50 and S = {1, 2, 3} are set for industrial datasets because the anomalies are more subtle. |
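
The reported experiment setup can be collected into a small configuration object. The following is a minimal sketch, not code from the CKAAD repository: the class name `ExperimentConfig`, the helper `config_for`, the field names, and the backbone strings are all hypothetical. Only the numeric values come from the Experiment Setup row above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical container for the hyperparameters reported in the table."""
    backbone: str                      # illustrative identifier, not a library name
    feature_stages: Tuple[int, ...]    # the set S of backbone stages
    lr_autoencoder: float              # depends on the dataset domain
    lr_discriminator: float = 1e-4     # same for both domains
    image_size: int = 256              # "All images are resized to 256"
    alpha: float = 0.5
    gamma: float = 0.5
    lam: float = 0.02                  # lambda in the paper
    adam_betas: Tuple[float, float] = (0.5, 0.999)


def config_for(domain: str) -> ExperimentConfig:
    """Return the reported settings for 'medical' or 'industrial' datasets."""
    if domain == "medical":
        return ExperimentConfig("resnet18", (2, 3), lr_autoencoder=1e-3)
    if domain == "industrial":
        # More stages are used because industrial anomalies are more subtle.
        return ExperimentConfig("wide_resnet50", (1, 2, 3), lr_autoencoder=5e-3)
    raise ValueError(f"unknown domain: {domain!r}")


if __name__ == "__main__":
    for name in ("medical", "industrial"):
        print(name, config_for(name))
```

Keeping the two domains in one frozen dataclass makes the shared values (α, γ, λ, the Adam β pair, the discriminator learning rate) explicit and leaves only the backbone, S, and the auto-encoder learning rate as per-domain differences.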