VA-AR: Learning Velocity-Aware Action Representations with Mixture of Window Attention

Authors: Jiangning Wei, Lixiong Qin, Bo Yu, Tianjian Zou, Chuhan Yan, Dandan Xiao, Yang Yu, Lan Yang, Ke Li, Jun Liu

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments confirm that VA-AR achieves state-of-the-art performance on the same five datasets, demonstrating VA-AR's effectiveness across a broad spectrum of action recognition scenarios. We conducted a comprehensive comparison with 11 skeleton-based action recognition methods. The experiments covered five large datasets, with published results copied into the paper and unpublished results obtained through retraining on these datasets. We conducted rigorous validations on both joint data and multi-modality data.
Researcher Affiliation | Academia | (1) Beijing University of Posts and Telecommunications; (2) Macau University of Science and Technology; (3) China Institute of Sport Science; (4) Beijing Sport University
Pseudocode | No | The paper provides architectural diagrams and mathematical formulas but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code: github.com/TrinityNeo99/VA-AR_official
Open Datasets | Yes | We meticulously selected five representative action recognition datasets to comprehensively assess the performance of the proposed method. These datasets comprise NTU RGB+D (NTU-60) (Shahroudy et al. 2016), NTU RGB+D 120 (NTU-120) (Liu et al. 2019), P2A (Bian et al. 2022), Olympic Badminton (Ghosh, Singh, and Jawahar 2018), and FineGym (Shao et al. 2020).
Dataset Splits | No | The paper refers to the 'X-Sub', 'X-View', and 'X-Set' splits when reporting results in Table 1 for specific datasets; these are standard benchmarks. However, it does not explicitly provide the split percentages, sample counts, or the split methodology in the main text. It also states 'We mixed the test sets of the five datasets and carefully distinguished them based on action speed', but this mixing is for analysis, not for the original experimental splits.
Hardware Specification | Yes | During the experimental process, we employed two NVIDIA 3090 GPUs for training, which encompassed a total of 60 training epochs.
Software Dependencies | No | The paper mentions using a Graph Convolutional Network (GCN) and the SGD optimizer but does not specify any software with version numbers (e.g., Python, PyTorch, or TensorFlow versions, or specific library versions).
Experiment Setup | Yes | We selected the Graph Convolutional Network (GCN) as the Spatial Module and configured three STBlocks. Additionally, we employed three different window sizes of 4, 8, and 16. During the experimental process, we employed two NVIDIA 3090 GPUs for training, which encompassed a total of 60 training epochs. For the optimizer, we chose SGD with a momentum of 0.9 and a weight decay of 0.0001, with batch size set to 32. The maximum temporal length for the NTU-60, NTU-120, and FineGym datasets was set to 256, whereas for the P2A and Olympic Badminton datasets it was set to 128.
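The reported hyperparameters can be collected into a single configuration sketch. This is a minimal, hypothetical illustration assuming plain Python (the names `config` and `sgd_momentum_step` are illustrative and not taken from the VA-AR codebase); the update function shows the standard SGD-with-momentum rule implied by the reported momentum of 0.9 and weight decay of 0.0001.

```python
# Hypothetical summary of the training setup reported in the paper.
# Keys and helper names are illustrative, not from the official repository.
config = {
    "spatial_module": "GCN",
    "num_st_blocks": 3,
    "window_sizes": [4, 8, 16],   # mixture-of-window attention branches
    "optimizer": "SGD",
    "momentum": 0.9,
    "weight_decay": 1e-4,
    "batch_size": 32,
    "epochs": 60,
    "max_temporal_length": {
        "NTU-60": 256, "NTU-120": 256, "FineGym": 256,
        "P2A": 128, "Olympic Badminton": 128,
    },
}

def sgd_momentum_step(w, g, v, lr, momentum=0.9, weight_decay=1e-4):
    """One scalar SGD update with momentum and L2 weight decay,
    using the hyperparameter values reported in the paper."""
    g = g + weight_decay * w   # add the L2 regularization gradient
    v = momentum * v + g       # accumulate velocity
    return w - lr * v, v       # updated weight and velocity

# Single illustrative step from w=1.0 with gradient 0.5 and zero velocity.
w, v = sgd_momentum_step(1.0, g=0.5, v=0.0, lr=0.1)
```

The dictionary also makes the per-dataset sequence lengths explicit: 256 frames for NTU-60, NTU-120, and FineGym versus 128 for P2A and Olympic Badminton.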