reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Dual Conditioned Motion Diffusion for Pose-Based Video Anomaly Detection

Authors: Hongsong Wang, Andi Xu, Pinle Ding, Jie Gui

AAAI 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on four datasets demonstrate that our method dramatically outperforms state-of-the-art methods and exhibits superior generalization performance. Experiments conducted on popular human-related anomaly detection datasets demonstrate the superior performance of our proposed method compared to state-of-the-art approaches. Our approach consistently beats the other state-of-the-art methods on the four benchmarks. In order to more comprehensively demonstrate the efficacy of our proposed framework, we undertake ablation studies, examining factors such as DCT, United Association Discrepancy (UAD), Conditioned Embedding (CE), and Mask Completion (MC).
Researcher Affiliation	Academia	1 School of Computer Science and Engineering, Southeast University, Nanjing 210096, China; 2 Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China; 3 School of Cyber Science and Engineering, Southeast University, Nanjing 210096, China; 4 Engineering Research Center of Blockchain Application, Supervision and Management (Southeast University), Ministry of Education, China; 5 Purple Mountain Laboratories, Nanjing 210000, China. EMAIL
Pseudocode	Yes	Algorithm 1: Training procedure of the proposed framework. Algorithm 2: Inference procedure of the proposed framework.
Open Source Code	Yes	Code https://github.com/guijiejie/DCMD-main
Open Datasets	Yes	We conduct experiments on four popular benchmarks: Human-related ShanghaiTech Campus (HR-STC), Human-related CUHK Avenue (HR-Avenue), HR-UBnormal, and UBnormal. Our approach outperforms the recent diffusion-based method MoCoDAD (Flaborea et al. 2023) by 1.0%, 1.0%, 0.6%, and 0.7% in AUC scores on the HR-STC, HR-Avenue, HR-UBnormal, and UBnormal datasets, respectively.
Dataset Splits	No	The paper does not provide specific training/test/validation splits (e.g., percentages or counts) for the overall datasets (HR-STC, HR-Avenue, HR-UBnormal, UBnormal). It only mentions how motion sequences within a window are split into history and future frames: 'For extracting motion sequences, we employ a window size of 7 frames, where the first 3 frames comprise the historical motion sequences, and the subsequent 4 frames represent the future motion sequences.'
Hardware Specification	Yes	The experiments are conducted on an NVIDIA GeForce RTX 4090 GPU.
Software Dependencies	No	The paper mentions using the 'Adam optimizer' but does not specify version numbers for any programming languages, libraries, or other software components used in the implementation.
Experiment Setup	Yes	We train the network end-to-end using the Adam optimizer with a learning rate of 1e-4 that is decayed every 36 epochs. The diffusion process employs cosine variance scheduling with β1 = 1e-4, βT = 2e-2, and T = 10. We set λ = 0.01. The hidden sizes of the encoder for the reconstruction branch are (512, 256), and the dimension of the hidden embedding is 256. The noise prediction network consisted of 6 layers of motion transformer blocks, where the number of heads is 8, and the hidden dimension is 512. The batch size is set to 4096 for HR-STC and 1024 for HR-Avenue.
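
The setup values quoted in the table can be gathered into a single configuration object, and the 7-frame window (3 history frames, 4 future frames) can be illustrated with a small helper. This is a minimal sketch, not the authors' code: the names `SetupConfig`, `BATCH_SIZES`, and `split_windows` are hypothetical, the sliding `stride` is an assumption (the table does not state one), and the role of λ is not described in the quoted text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SetupConfig:
    """Hyperparameters as quoted in the table; field names are hypothetical."""
    learning_rate: float = 1e-4          # Adam learning rate
    lr_decay_every_epochs: int = 36      # decay interval (decay factor not stated)
    beta_1: float = 1e-4                 # first variance-schedule value
    beta_T: float = 2e-2                 # last variance-schedule value
    diffusion_steps: int = 10            # T
    lambda_weight: float = 0.01          # lambda; its role is not stated in the table
    encoder_hidden_sizes: tuple = (512, 256)
    embedding_dim: int = 256
    transformer_layers: int = 6
    attention_heads: int = 8
    transformer_hidden_dim: int = 512
    window_size: int = 7
    history_frames: int = 3
    future_frames: int = 4


# Batch sizes are only reported for these two datasets.
BATCH_SIZES = {"HR-STC": 4096, "HR-Avenue": 1024}


def split_windows(frames, cfg=SetupConfig(), stride=1):
    """Slide a fixed-size window over `frames` and split each window into
    (history, future). The stride is an assumption, not taken from the paper."""
    pairs = []
    for start in range(0, len(frames) - cfg.window_size + 1, stride):
        window = frames[start:start + cfg.window_size]
        history = window[:cfg.history_frames]
        future = window[cfg.history_frames:]
        pairs.append((history, future))
    return pairs
```

For example, `split_windows(list(range(9)))` yields three windows, the first being `([0, 1, 2], [3, 4, 5, 6])`. Frames are represented here as plain list items; in practice each item would be a pose (a set of joint coordinates).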