Active Multimodal Distillation for Few-shot Action Recognition

Authors: Weijia Feng, Yichen Zhu, Ruojia Zhang, Chenyang Wang, Fei Ma, Xiaobao Wang, Xiaobai Li

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments across multiple benchmarks demonstrate that our method significantly outperforms existing approaches." Relevant section headings: 4 Experiments; 4.1 Validation Protocol (Datasets. We assess our proposed method on four prominent and challenging benchmarks for few-shot action recognition: Kinetics-400 [Kay et al., 2017], Something-Something V2 [Goyal et al., 2017], HMDB51 [Wang et al., 2015], and UCF101 [Peng et al., 2018]); 4.3 Comparative Experiments; 4.4 Ablation Study.
Researcher Affiliation | Academia | (1) College of Computer and Information Engineering, Tianjin Normal University, Tianjin, China; (2) College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China; (3) Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen, China; (4) College of Intelligence and Computing, Tianjin University, Tianjin, China; (5) The State Key Laboratory of Blockchain and Data Security, Zhejiang University, Hangzhou, China; (6) Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security, Hangzhou, China
Pseudocode | No | The paper describes methods through narrative text and mathematical equations, but does not include a distinct section or figure labeled 'Pseudocode' or 'Algorithm', nor does it present any structured algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statement about making the source code available, nor does it provide a link to a code repository for the described methodology.
Open Datasets | Yes | Datasets. We assess our proposed method on four prominent and challenging benchmarks for few-shot action recognition: Kinetics-400 [Kay et al., 2017], Something-Something V2 [Goyal et al., 2017], HMDB51 [Wang et al., 2015], and UCF101 [Peng et al., 2018].
Dataset Splits | No | In the meta-training phase, we utilize a multimodal video dataset D_train that encompasses base action classes C_train. ... In the meta-test phase, we employ a multimodal dataset D_test, which includes novel action classes C_test that are disjoint from the training classes (C_test ∩ C_train = ∅). Similar to the meta-training phase, the support and query sets for each test task are constructed in the same manner. ... Within the N-way K-shot meta-learning setting, the query set Q = {(x_i^r, x_i^f, y_i)}_{i=1}^{M} includes M multimodal query samples. ... The support set S = {(x_i^r, x_i^f, y_i)}_{i=M+1}^{M+NK} contains K multimodal samples for each of the N classes.
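The N-way K-shot episodic protocol quoted above can be made concrete with a small sampling sketch. Everything below is illustrative and not from the paper: the function name `sample_episode`, the class-indexed dictionary layout, and the use of strings to stand in for (RGB, optical-flow) sample pairs are all assumptions.

```python
import random

def sample_episode(dataset, n_way=5, k_shot=1, m_query=5, seed=None):
    """Build one N-way K-shot episode from a class -> samples mapping.

    `dataset` maps each class label to a list of multimodal samples
    (each sample standing in for an (RGB, flow) pair). Returns a
    support set with K samples per class and a query set with
    `m_query` samples per class, drawn without overlap.
    """
    rng = random.Random(seed)
    classes = rng.sample(sorted(dataset), n_way)  # N novel/base classes
    support, query = [], []
    for label in classes:
        # sample K + m_query distinct clips, then split support/query
        picks = rng.sample(dataset[label], k_shot + m_query)
        support += [(clip, label) for clip in picks[:k_shot]]
        query += [(clip, label) for clip in picks[k_shot:]]
    return support, query

# toy "dataset": 10 classes, 8 multimodal clips each (strings as stand-ins)
toy = {c: [f"clip_{c}_{i}" for i in range(8)] for c in range(10)}
S, Q = sample_episode(toy, n_way=5, k_shot=1, m_query=3, seed=0)
```

Note that the excerpt defines Q as M query samples in total, whereas this sketch draws a fixed number per class; the two views coincide when M = N * m_query.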
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as exact GPU or CPU models.
Software Dependencies | No | The paper mentions using the SGD optimizer and specific pre-trained models (ResNet-50, I3D) but does not provide specific version numbers for any software dependencies, such as deep learning frameworks or programming languages.
Experiment Setup | Yes | Parameters. In the meta-training phase, the balance weight (λ, specified in Eq. 9) is uniformly set to 1.0 across all benchmarks. Training is conducted using the SGD optimizer. For both RGB and optical flow modalities, the respective networks are iteratively updated by minimizing a combined weighted loss function, which includes both cross-entropy and distillation losses, until convergence is achieved. The learning rate γ is set to 10^-3.
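The quoted setup describes a combined objective of the form L = L_CE + λ * L_distill with λ = 1.0. The excerpt does not give the exact distillation term, so the sketch below assumes a common KL-divergence form between teacher and student class distributions; all function names are illustrative, not the paper's.

```python
import math

def softmax(logits):
    # numerically stable softmax over a list of logits
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # standard cross-entropy for a single sample
    return -math.log(softmax(logits)[label])

def kl_distill(student_logits, teacher_logits):
    # KL(teacher || student): an assumed form of the distillation loss
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def combined_loss(student_logits, label, teacher_logits, lam=1.0):
    # L = L_CE + lambda * L_distill, with lambda = 1.0 as in the excerpt
    return cross_entropy(student_logits, label) + lam * kl_distill(
        student_logits, teacher_logits)
```

In the paper's setting this scalar loss would be minimized per modality (RGB and optical flow) with SGD at learning rate 10^-3; the sketch only shows how the two loss terms are weighted and combined.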