SeFAR: Semi-supervised Fine-grained Action Recognition with Temporal Perturbation and Learning Stabilization

Authors: Yongle Huang, Haodong Chen, Zhenbang Xu, Zihan Jia, Haozhou Sun, Dian Shao

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that SeFAR achieves state-of-the-art performance on two FAR datasets, FineGym and FineDiving, across various data scopes, as well as two classical coarse-grained datasets, UCF101 and HMDB51. Further analysis and ablation studies validate the effectiveness of our designs.
Researcher Affiliation | Academia | 1 Unmanned System Research Institute, Northwestern Polytechnical University, Xi'an, China; 2 School of Automation, Northwestern Polytechnical University, Xi'an, China; 3 School of Computer Science, Northwestern Polytechnical University, Xi'an, China; 4 School of Software, Northwestern Polytechnical University, Xi'an, China; EMAIL, EMAIL
Pseudocode | No | The paper describes the methodology in narrative text and mathematical formulas without presenting a distinct pseudocode block or algorithm.
Open Source Code | Yes | Code: https://github.com/KyleHuang9/SeFAR
Open Datasets | Yes | We perform evaluations on fine-grained datasets Gym99, Gym288 (Shao et al. 2020), and FineDiving (Xu et al. 2022a), as well as coarse-grained datasets UCF-101 (Soomro 2012) and HMDB-51 (Kuehne et al. 2011), using Top-1 accuracy as metrics. Additionally, we use the Something-Something V2 (Sth.Sth.) dataset (Goyal et al. 2017) in ablation studies.
Dataset Splits | Yes | The labeling rates of the data are indicated by 5%, 10%, and 20% in the datasets. (Table 1 caption)
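As context for the labeling rates above, here is a minimal sketch of how a labeled/unlabeled split at a given rate could be carved out for semi-supervised training. The function name, the purely random (non-class-balanced) selection, and the fixed seed are assumptions for illustration; the paper's exact split protocol may differ.

```python
import random

def split_by_label_rate(samples, label_rate, seed=0):
    """Split samples into labeled/unlabeled subsets; `label_rate` is
    e.g. 0.05, 0.10, or 0.20, matching the 5%/10%/20% settings."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_labeled = int(len(shuffled) * label_rate)
    return shuffled[:n_labeled], shuffled[n_labeled:]

# e.g. a 10% labeling rate over 100 samples
labeled, unlabeled = split_by_label_rate(range(100), 0.10)
print(len(labeled), len(unlabeled))  # 10 90
```

In practice, semi-supervised FAR benchmarks typically fix such splits once and reuse them across methods so that accuracy numbers are comparable.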
Hardware Specification | No | The paper does not provide specific hardware details (GPU models, CPU types, etc.) used for running its experiments.
Software Dependencies | No | The paper mentions various models and frameworks like ViT, TimeSformer, FixMatch, Vicuna-7B, CLIP-ViT, and EVA-CLIP, but does not provide specific version numbers for any underlying software libraries or programming languages.
Experiment Setup | Yes | We employ the ViT (Dosovitskiy 2020) extended model TimeSformer (Bertasius, Wang, and Torresani 2021) as the backbone. We instantiate the SeFAR-S model based on ViT-S... We configure the sampling combination by default as {2, 2, 4} for SeFAR, matching the commonly used 8-frame input. Tables 1 and 2 also specify the number of input frames (#F = 8) and training epochs (Epoch = 30) for SeFAR.
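To illustrate the {2, 2, 4} sampling combination, here is a sketch of one plausible reading: the clip is divided into three consecutive equal-length segments, and 2, 2, and 4 frames are drawn evenly spaced from them, giving the 8-frame input (2 + 2 + 4 = 8). The segment layout and the even-spacing rule are assumptions for illustration, not the paper's exact sampler.

```python
import numpy as np

def sample_frame_indices(num_frames, counts=(2, 2, 4)):
    """Draw sum(counts) frame indices: counts[i] evenly spaced frames
    from the i-th of len(counts) equal-length consecutive segments."""
    seg_len = num_frames / len(counts)
    indices = []
    for i, n in enumerate(counts):
        start, end = i * seg_len, (i + 1) * seg_len - 1
        # evenly spaced positions inside this segment, rounded to ints
        indices.extend(np.linspace(start, end, n).round().astype(int).tolist())
    return indices

# a 60-frame clip sampled with the default {2, 2, 4} combination
print(sample_frame_indices(60))  # [0, 19, 20, 39, 40, 46, 53, 59]
```

Note how the denser final group (4 frames) covers its segment at a finer temporal granularity than the two sparser groups, which is the intuition behind mixing sampling densities.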