reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

DriftRemover: Hybrid Energy Optimizations for Anomaly Images Synthesis and Segmentation

Authors: Siyue Yao, Haotian Xu, Mingjie Sun, Siyue Yu, Jimin Xiao, Eng Gee Lim

IJCAI 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our evaluation demonstrates that our method improves pixel-level AP by 1.3% and F1-MAX by 1.8% in anomaly detection tasks on the MVTec dataset. Additionally, its successful application in practical scenarios highlights its effectiveness, improving IoU by 37.2% and F-measure by 25.1% with the Floor Dirt dataset.
Researcher Affiliation	Collaboration	Siyue Yao (1,2), Haotian Xu (3), Mingjie Sun (4), Siyue Yu (1), Jimin Xiao (1), Eng Gee Lim (1). (1) Xi'an Jiaotong-Liverpool University; (2) University of Liverpool; (3) Ripple Info; (4) Soochow University. Corresponding author (EMAIL).
Pseudocode	Yes	Algorithm 1: Inference process of the proposed DriftRemover
    Input: normal image $x^n$, coarse mask $m$, input anomaly prompt $p$, total inference timestep $T$
    Parameter: conditional noise predictor $\epsilon_\theta(\cdot)$, image encoder $\varepsilon(\cdot)$, binary mask threshold $\eta$, last timestep $\gamma$ for adding AAR module, first timestep $\delta$ for adding APO module, function repeat times $\Gamma$, energy value threshold $\Theta$, pre-defined parameters $\alpha_T$, $I$, $\beta_t$ and $\sigma_t$
    Output: the synthetic anomaly image latent $z^s_0$
    1: $z^n_0 = \varepsilon(x^n)$; $z^n_T \sim \mathcal{N}(0, I)$
    2: $z^s_T = \mathrm{concat}\big(z^n_T,\ m,\ \varepsilon(x^n (1 - m))\big)$
    3: for $t = T$ to $1$ do
    4:     $i = 0$
    5:     if $t > \gamma$ then
    6:         Obtain the attention map with new normal embedding $A^g_t$ and new anomaly embedding $A^s_t$.
    7:         $\hat{m} = (A^g_t > \eta)\,(A^s_t > \eta)\,m$    (Equation 7)
    8:         while $i < \Gamma$ and $F_R(A^s_t, \hat{m}) < \Theta$ do
    9:             $z^s_t \leftarrow z^s_t - \sigma_t \nabla_{z^s_t} F_R(A^s_t, \hat{m})$    (Equation 8); $i \mathrel{+}= 1$
    10:        end while
    11:    else if $t < \delta$ then
    12:        while $i < \Gamma$ and $F_O\big(f(z^r), f(z^s_t)\big) < \Theta$ do
    13:            $z^s_t \leftarrow z^s_t - \sigma_t \nabla_{z^s_t} F_O\big(f(z^r), f(z^s_t)\big)$    (Equation 10); $i \mathrel{+}= 1$
    14:        end while
    15:    end if
    16:    $z^s_{t-1} = \frac{1}{\sqrt{\alpha_t}} \Big( z^s_t - \frac{1 - \alpha_t}{\sqrt{1 - \alpha_t}}\, \epsilon_\theta(z^s_t, p, t) \Big) + \beta_t I$    (Equation 2)
    17:    $z^s_{t-1} \leftarrow z^s_{t-1}\, m + \big( \sqrt{\alpha_{t-1}}\, z^n_0 + \sqrt{1 - \alpha_{t-1}}\, I \big)(1 - m)$    (Equation 6)
    18: end for
    19: return $z^s_0$
Open Source Code	Yes	The code is available at https://github.com/JJessicaYao/DriftRemover.
Open Datasets	Yes	We evaluate our DriftRemover on MVTec [Bergmann et al., 2019] and Floor Dirt dataset. MVTec's original training set consists of 3,629 normal images without any anomaly, while its original test set contains 467 normal images and 1,258 anomaly images along with their corresponding mask labels for the anomaly areas.
Dataset Splits	Yes	MVTec's original training set consists of 3,629 normal images without any anomaly, while its original test set contains 467 normal images and 1,258 anomaly images along with their corresponding mask labels for the anomaly areas. Subsequently, followed by [Hu et al., 2023], we randomly select 1/3 of the abnormal images for training DriftRemover and the remaining images are used to test the results of the downstream tasks. The Floor Dirt dataset is collected from robotic vacuum cleaners, containing two types of anomalies: stains on the floor (500 images) and pet faeces on the floor (458 images). In our experiments, 3/5 of anomalous images are randomly selected for training our DriftRemover, and 2/5 are used for testing downstream tasks.
Hardware Specification	No	The paper does not explicitly describe the hardware used for running its experiments. It mentions 'Our pipeline is built on Stable Diffusion V1.5' which refers to a software model.
Software Dependencies	Yes	Our pipeline is built on Stable Diffusion V1.5 [Rombach et al., 2022], training it for 2,000 epochs with batch size of 4 and image size of 512.
Experiment Setup	Yes	Our pipeline is built on Stable Diffusion V1.5 [Rombach et al., 2022], training it for 2,000 epochs with batch size of 4 and image size of 512. The optimizer AdamW utilizes a scaled learning rate initialized to 1e-4. We use 20 steps and a guidance scale of 3.5 for image generation, producing 1,000 images per class for evaluation and training. The threshold Θ and iteration cap Γ are 0.01 and 5. The last timestep γ for adding AAR module is 600, while the first timestep δ for adding APO module is 300. The binary threshold η is 180, patch size v is 3, text dimension q is 768, head number h is 8 and reduction factors k for each layer are 1, 2 and 4.
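
The timestep gating in Algorithm 1, combined with the reported values γ = 600 and δ = 300, can be illustrated with a minimal Python sketch. It is not taken from the authors' repository. The function names and string labels are hypothetical, and the 20 evenly spaced timesteps over a 1,000-step schedule are an assumption, since the page reports only "20 steps".

```python
from typing import List, Optional, Tuple

# Values quoted in the Experiment Setup row.
GAMMA = 600  # last timestep for adding the AAR module
DELTA = 300  # first timestep for adding the APO module


def active_module(t: int, gamma: int = GAMMA, delta: int = DELTA) -> Optional[str]:
    """Return which energy module would run at timestep t.

    Mirrors the branch structure of Algorithm 1:
    `if t > gamma ... else if t < delta ...`.
    The function name and labels are hypothetical.
    """
    if t > gamma:
        return "AAR"
    if t < delta:
        return "APO"
    return None  # neither module runs when delta <= t <= gamma


def schedule(timesteps: List[int]) -> List[Tuple[int, Optional[str]]]:
    """Pair each timestep with the module that would be active there."""
    return [(t, active_module(t)) for t in timesteps]


if __name__ == "__main__":
    # Assumption: 20 evenly spaced timesteps, from 951 down to 1.
    steps = list(range(951, 0, -50))
    for t, name in schedule(steps):
        print(t, name)
```

Under these assumptions the early, noisy timesteps (t > 600) fall in the AAR branch, the late timesteps (t < 300) fall in the APO branch, and the timesteps in between receive no energy guidance.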