reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Local Patterns Generalize Better for Novel Anomalies

Authors: Yalong Jiang

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on popular benchmark datasets demonstrate the achievement of state-of-the-art performance. Code is available at https://github.com/AllenYLJiang/Local-Patterns-Generalize-Better/. 4 EXPERIMENTS AND RESULTS This section compares the proposed method with state-of-the-art ones and presents ablation studies.
Researcher Affiliation	Academia	Yalong Jiang Beihang University Allen EMAIL
Pseudocode	Yes	Algorithm 1 Two-Stage Process for Identifying Spatial Local Patterns... Algorithm 2 Algorithm for Converting Declarative Sentences to Interrogative Sentences
Open Source Code	Yes	Code is available at https://github.com/AllenYLJiang/Local-Patterns-Generalize-Better/.
Open Datasets	Yes	Experiments are conducted on seven datasets. The training sets of ShanghaiTech, Avenue and UCSD Ped2 contain only normal events and anomalies reside in test data. (1) ShanghaiTech dataset Liu et al. (2018)... (2) CUHK Avenue dataset Lu et al. (2013)... (3) Ubnormal dataset Acsintoae et al. (2022)... (4) NWPU Campus dataset Cao et al. (2023)... (5) UCSD Ped2 dataset Li et al. (2014)... (6) UCF Crime dataset Sultani et al. (2018)... (7) XD Violence Wu et al. (2020)... SMM, with $N_{SMM} = 3$ state machines, is trained on the COCO-Caption dataset Lin et al. (2014).
Dataset Splits	Yes	(1) Shanghai Tech dataset Liu et al. (2018) includes 330 training videos and 107 test videos. (2) CUHK Avenue dataset Lu et al. (2013) involves 16 training videos and 21 test videos. (3) Ubnormal dataset Acsintoae et al. (2022) is divided into a training set with 268 videos, a validation set with 64 videos, and a test set with 211 videos. (4) NWPU Campus dataset Cao et al. (2023) comprises 43 scenes, 28 classes of anomalies and 16 hours of video footage. (5) UCSD Ped2 dataset Li et al. (2014) contains 16 normal training videos and 12 test videos. (6) UCF Crime dataset Sultani et al. (2018) includes 1610 training videos in which 800 contain only normal behaviors. The test set includes 290 videos in which 140 include anomalies. (7) XD Violence Wu et al. (2020) includes 4754 videos where 2349 are non-violent and 2405 are violent. There are 3954 training videos and 800 test videos where 500 are violent.
Hardware Specification	Yes	Implementations are based on Pytorch Pytorch (2018) and an NVIDIA A100 GPU... All experiments are conducted on an NVIDIA A100 GPU and an Intel(R) Xeon(R) Gold 6248R CPU.
Software Dependencies	No	Implementations are based on Pytorch Pytorch (2018) and an NVIDIA A100 GPU. RM is trained on benchmark videos without anomalies. The influences of RM's number of layers will be shown in Appendix F. The evaluations on operational efficiency will be detailed in Appendix H. ... To enhance the spatial local patterns obtained from Stage 2, this paper proposes to encode frames into H.265 (HEVC) videos using FFmpeg Zeng et al. (2016). ... $N_{vocabulary}$ is vocabulary size, according to BERT tokenizer Devlin et al. (2018). ... The conversion is based on nltk library Hardeniya et al. (2016)
Experiment Setup	Yes	To capture more contexts, bounding boxes are expanded by 50% on both sides horizontally and vertically... For image region i at t, the output of backbone and ITAM are $H^I_i(t) \in \mathbb{R}^{S_d \times V_d}$ and $F^I_i(t) \in \mathbb{R}^{N_q \times H_d}$, which satisfy $S_d = 257$, $V_d = 1408$, $N_q = 32$, $H_d = 768$... RM has $D_h = 512$ in intermediate layers. SMM, with $N_{SMM} = 3$ state machines, is trained on the COCO-Caption dataset Lin et al. (2014)... All weights are initialized with distribution $\mathcal{N}(0, 0.02)$. Training spans 20 epochs with initial learning rate $5 \times 10^{-5}$ and decay 0.99. RM takes concatenated $G^I_i(t)$ and $M^I_i(t)$ as input, with ReLU activations. It is trained using Adam optimizer with learning rate $10^{-3}$ for 10 epochs, using MSE loss.
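
Two details quoted under Experiment Setup, the 50% bounding-box expansion and the learning-rate decay of 0.99, can be illustrated with a minimal Python sketch. This is not the authors' code. The function names `expand_box` and `decayed_lr` are hypothetical, and two points are assumptions the quoted text does not settle: that "50% on both sides" means each side moves outward by 50% of the box width or height (with clipping to the frame), and that the decay of 0.99 is applied once per epoch.

```python
def expand_box(box, frame_w, frame_h, ratio=0.5):
    """Expand an (x1, y1, x2, y2) box outward on every side.

    Assumption: each side moves by `ratio` times the box width (left/right)
    or height (top/bottom); the result is clipped to the frame.
    """
    x1, y1, x2, y2 = box
    dx = (x2 - x1) * ratio  # horizontal margin added on each side
    dy = (y2 - y1) * ratio  # vertical margin added on each side
    return (
        max(0.0, x1 - dx),
        max(0.0, y1 - dy),
        min(float(frame_w), x2 + dx),
        min(float(frame_h), y2 + dy),
    )


def decayed_lr(epoch, initial_lr=5e-5, decay=0.99):
    """Learning rate after `epoch` epochs.

    Assumption: exponential decay applied once per epoch.
    """
    return initial_lr * decay ** epoch


if __name__ == "__main__":
    # A 100x200 box inside a 640x480 frame.
    print(expand_box((100, 100, 200, 300), 640, 480))
    # Learning rates over the 20 epochs quoted above.
    for epoch in range(20):
        print(epoch, decayed_lr(epoch))
```

If the paper instead intends a total growth of 50% (25% per side), or a per-step rather than per-epoch decay, only the `ratio` argument or the exponent would change.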