Language-Assisted Feature Transformation for Anomaly Detection

Authors: EungGu Yun, Heonjin Ha, Yeongwoo Nam, Bryan Lee

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on both toy and real-world datasets validate the effectiveness of our method.
Researcher Affiliation | Industry | EungGu Yun, SAIGE, Seoul, South Korea; Heonjin Ha, LG Uplus, Seoul, South Korea; Yeongwoo Nam, Alsemy Inc., Seoul, South Korea; Bryan Dongik Lee, Independent, Seoul, South Korea
Pseudocode | Yes | (Appendix E: Algorithm) The following pseudocode demonstrates the implementation of LAFT AD, using a syntax similar to NumPy, following the notation used in Radford et al. (2021).
Open Source Code | Yes | We provide the source code of our method at https://github.com/yuneg11/LAFT.
Open Datasets | Yes | Datasets: To validate our approach, we used the colored version of MNIST (LeCun et al., 2010), Waterbirds (Sagawa et al., 2019), and CelebA (Liu et al., 2015) datasets for semantic anomaly detection (SAD). ... we also used the MVTec AD (Bergmann et al., 2019) and VisA (Zou et al., 2022) datasets for industrial anomaly detection (IAD).
Dataset Splits | Yes |
Colored MNIST (R = red, G = green, B = blue colored digits; 0-4 and 5-9 denote the digits from 0 to 4 and from 5 to 9): Train: R/0-4 (16.67%); Test: R/0-4 (16.67%), R/5-9 (16.67%), GB/0-4 (33.33%), GB/5-9 (33.33%).
Waterbirds (Wbird = waterbirds, Lbird = landbirds; Wback = water background, Lback = land background): Train: Wbird/Wback (22.04%); Test: Wbird/Wback (11.08%), Wbird/Lback (11.08%), Lbird/Wback (38.92%), Lbird/Lback (38.92%).
CelebA (Blond = blond hair, Glass = eyeglasses; -Blond = non-blond hair, -Glass = no eyeglasses): Train: Blond/-Glass (14.66%); Test: Blond/Glass (13.01%), Blond/-Glass (0.31%), -Blond/Glass (80.53%), -Blond/-Glass (6.15%).
MVTec AD and VisA: We use the same splits as Bergmann et al. (2019) and Zou et al. (2022).
Hardware Specification | Yes | We use a single NVIDIA RTX 3090 GPU for all experiments.
Software Dependencies | No | The paper mentions specific vision-language models and their backbones (CLIP ViT-B/16, OpenCLIP, EVA-CLIP, SigLIP, CoCa) but does not provide version numbers for general software dependencies like Python, PyTorch, or CUDA.
Experiment Setup | Yes | Hyperparameters: The only hyperparameter in LAFT is the number of PCA components d. We typically choose d from 4 to 32 when guiding an attribute and from 32 to 384 when ignoring an attribute; refer to the Ablation Study for the impact of d on performance. We use k = 30 for the methods using kNN anomaly scoring (kNN and LAFT AD).
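
The setup above names two ingredients: a language-guided feature transform with a single hyperparameter d (the number of PCA components, used either to keep or to remove an attribute subspace) and kNN anomaly scoring with k = 30. A minimal NumPy sketch of that pipeline is shown below; the function names, the mean-centering step, and the use of Euclidean distance are illustrative assumptions, not the authors' released implementation (see the linked repository for that).

```python
import numpy as np

def laft_transform(image_feats, prompts_a, prompts_b, d=8, guide=True):
    """Sketch of a LAFT-style transform (assumed details).

    Builds a concept subspace from differences of paired text embeddings
    (prompts_a[i] vs. prompts_b[i]), keeps the top-d PCA directions, and
    either projects image features onto that subspace (guide=True) or
    removes it (guide=False). The report suggests d in 4-32 for guiding
    and 32-384 for ignoring an attribute.
    """
    diffs = prompts_a - prompts_b                      # (num_pairs, dim)
    diffs = diffs - diffs.mean(axis=0, keepdims=True)  # centering: an assumption
    # Top-d principal directions via SVD of the difference matrix.
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    basis = vt[:d]                                     # (d, dim)
    proj = image_feats @ basis.T @ basis               # projection onto subspace
    return proj if guide else image_feats - proj

def knn_anomaly_score(test_feats, train_feats, k=30):
    """kNN anomaly score: mean distance to the k nearest training features."""
    dists = np.linalg.norm(
        test_feats[:, None, :] - train_feats[None, :, :], axis=-1
    )                                                  # (n_test, n_train)
    nearest = np.sort(dists, axis=1)[:, :k]
    return nearest.mean(axis=1)                        # higher = more anomalous
```

Note that the guided and ignored features are complementary by construction: projecting onto the subspace and projecting it out sum back to the original features, which is why a single d controls both modes.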