Video Anomaly Detection with Motion and Appearance Guided Patch Diffusion Model

Authors: Hang Zhou, Jiale Cai, Yuteng Ye, Yonghui Feng, Chenxing Gao, Junqing Yu, Zikai Song, Wei Yang

AAAI 2025

Reproducibility Variable — Result — LLM Response
Research Type — Experimental. "Experimental results on four challenging video anomaly detection datasets empirically substantiate the efficacy of our proposed approach, demonstrating that it consistently outperforms most existing methods in detecting abnormal behaviors."
Researcher Affiliation — Academia. Huazhong University of Science and Technology, Wuhan, China.
Pseudocode — No. The paper describes its methods with text, equations, and diagrams (e.g., Figure 1, Figure 2) but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code — No. The paper neither states that source code is released nor provides a link to a code repository or a mention of code in supplementary materials.
Open Datasets — Yes. "We assess the effectiveness of our approach using four standard datasets prevalent in the Visual Anomaly Detection community. The datasets we employ include: Ped2... Avenue... Shanghai Tech Campus... UBnormal... We utilize the standard training and testing sets provided to assess the performance of our method for one-class VAD."
Dataset Splits — Yes. "Ped2. It comprises 16 training videos and 12 test videos captured in static environments. Avenue. It consists of 16 training videos and 21 testing videos... Shanghai Tech Campus. It consists of 330 training videos and 107 testing videos... We utilize the standard training and testing sets provided to assess the performance of our method for one-class VAD. ... The training epochs are set to 1000 for Ped2, 300 for Avenue, 30 for Shanghai and 40 for UB. The batch size in all four datasets is 16."
Hardware Specification — No. The only hardware statement is "The computation is completed on the HPC Platform of Huazhong University of Science and Technology," which does not specify GPU or CPU models, counts, or memory.
Software Dependencies — No. The paper mentions the Adam optimizer, diffusion settings "implemented in the same way as in DDIM," and RAFT (Teed and Deng 2020) for optical-flow extraction, but names no software libraries or version numbers.
Experiment Setup — Yes. "Each frame is resized to 256x256. The seventh frame is predicted using the previous six frames. The patch memory bank capacity for all datasets is established at 16x64x256. In the training phase, patches are randomly selected and extracted from complete images, with each patch measuring 64x64 for all datasets. The Adam optimizer is used to train our MA-PDM, with a learning rate set at 0.0002. The diffusion process settings are as follows: β1 = 0.0001, β2 = 0.02, T = 1000, and the linear schedule is implemented in the same way as in DDIM. The weight of the loss function is λ = 0.1. The training epochs are set to 1000 for Ped2, 300 for Avenue, 30 for Shanghai and 40 for UB. The batch size in all four datasets is 16. During the testing phase, we employ a sliding-window approach with a stride of 64. The number of reverse steps in DDIM is configured to 5. The values for α are (0, 0.2, 0.3, 1) across the four datasets."
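The quoted hyperparameters can be made concrete with a minimal sketch. This assumes a standard DDPM-style linear beta schedule, uniform spacing of the 5 DDIM reverse steps (the spacing strategy is not stated in the paper), and simple non-overlapping tiling for the 64x64 sliding window on a 256x256 frame; all function names are illustrative, not from the authors' code.

```python
import numpy as np

# Linear beta schedule from the quoted settings: beta_1 = 1e-4, beta_2 = 0.02, T = 1000.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_cumprod = np.cumprod(1.0 - betas)  # used by the forward noising process

# DDIM samples along a subsequence of timesteps; the paper reports 5 reverse steps.
# Uniform spacing is one common choice (an assumption here, not stated in the paper).
ddim_steps = np.linspace(0, T - 1, 5, dtype=int)

def sliding_window_patches(frame, patch=64, stride=64):
    """Extract patches with a sliding window; with patch == stride == 64 on a
    256x256 frame this yields a 4x4 grid of 16 non-overlapping patches."""
    h, w = frame.shape[:2]
    return [frame[y:y + patch, x:x + patch]
            for y in range(0, h - patch + 1, stride)
            for x in range(0, w - patch + 1, stride)]

frame = np.zeros((256, 256, 3))
patches = sliding_window_patches(frame)  # 16 patches, each (64, 64, 3)
```

Note that with stride equal to the patch size, the test-time sliding window covers each frame exactly once with no overlap, which keeps the per-frame cost at 16 patch-level diffusion evaluations.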