Efficient Self-Supervised Video Hashing with Selective State Spaces
Authors: Jinpeng Wang, Niu Lian, Jun Li, Yuting Wang, Yan Feng, Bin Chen, Yongbing Zhang, Shu-Tao Xia
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate S5VH's improvements over state-of-the-art methods, superior transferability, and scalable advantages in inference efficiency. ... We conduct extensive experiments on 4 datasets: ActivityNet, FCVID, UCF101, and HMDB51, demonstrating that S5VH outperforms state-of-the-art baselines under various setups and transfers better across datasets. ... Additionally, we provide comprehensive ablations and analyses, focusing on network architecture and training strategy. |
| Researcher Affiliation | Collaboration | 1Tsinghua Shenzhen International Graduate School, Tsinghua University 2Harbin Institute of Technology, Shenzhen ... 4Meituan, Beijing |
| Pseudocode | No | The paper describes an optimization problem for hash center generation using equations and prose (e.g., 'Optimization for Hash Center Generation' section), but it does not present this as a structured pseudocode or algorithm block. |
| Open Source Code | Yes | Code https://github.com/gimpong/AAAI25-S5VH |
| Open Datasets | Yes | We conduct experiments on 4 benchmark datasets. (i) ActivityNet (Caba Heilbron et al. 2015) ... (ii) FCVID (Jiang et al. 2017) ... (iii) UCF101 (Soomro, Zamir, and Shah 2012) ... (iv) HMDB51 (Kuehne et al. 2011) |
| Dataset Splits | Yes | (i) ActivityNet ... using 9,722 videos for training. We uniformly sample 1,000 videos across 200 categories in the validation set as queries, and the remaining 3,758 videos as the database. ... (iii) UCF101 ... We use 9,537 videos for training and the database, and 3,783 videos from the test set as the query set. (iv) HMDB51 ... We use 3,570 videos for both training and the database, and 1,530 videos from the test set are designated as the query set. |
| Hardware Specification | No | We perform stress testing with them in the same computational environment, taking 5 samples as a unit to probe the maximally affordable batch sizes and measuring the average inference time per sample. |
| Software Dependencies | No | For the model training, we choose the AdamW optimizer with default parameters in PyTorch |
| Experiment Setup | Yes | For the model training, we choose the AdamW optimizer with default parameters in PyTorch, and employ a cosine annealed learning rate scheduling from 5e-4 to 1e-5. The models are trained for up to 350 epochs with 5-patience early-stopping to prevent overfitting. The default hyperparameter configurations are as below: (i) We set the mask ratio ρ = |M|/Nt to 0.75 on the FCVID dataset and 0.5 on the rest of the datasets. (ii) The temperature factor τ in Equations (20) and (21) is set to 0.5. (iii) The number of semantic centers Nc is set to 450 on FCVID and 100 on the other datasets. ... we use 6 layers for the encoder and 1 layer for the decoder. The latent dimensions of the encoder and decoder are set to 256 and 192, respectively. |
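The training schedule quoted above (cosine-annealed learning rate from 5e-4 to 1e-5, up to 350 epochs, 5-patience early stopping) can be sketched in plain Python. This is an illustrative reconstruction, not code from the paper's repository; the function and class names (`cosine_annealed_lr`, `EarlyStopping`) are our own, and the paper likely relies on PyTorch's built-in equivalents.

```python
import math

def cosine_annealed_lr(epoch, total_epochs=350, lr_max=5e-4, lr_min=1e-5):
    """Cosine-annealed learning rate: lr_max at epoch 0, lr_min at total_epochs.

    Matches the schedule described in the setup ('from 5e-4 to 1e-5').
    """
    cos_term = 1 + math.cos(math.pi * epoch / total_epochs)
    return lr_min + 0.5 * (lr_max - lr_min) * cos_term

class EarlyStopping:
    """Stop training after `patience` consecutive epochs without improvement."""

    def __init__(self, patience=5):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a PyTorch training loop, the same behavior would typically come from `torch.optim.AdamW` combined with `torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=350, eta_min=1e-5)`; the sketch above only makes the schedule's arithmetic explicit.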