Rethinking Masked Data Reconstruction Pretraining for Strong 3D Action Representation Learning
Authors: Tao Gong, Qi Chu, Bin Liu, Nenghai Yu
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on NTU-60, NTU-120, and PKU-MMD datasets show that the proposed pre-training strategy achieves state-of-the-art results without bells and whistles. ... Experiments: Datasets and Implementation Details ... Main Results ... Ablation Study |
| Researcher Affiliation | Academia | 1 School of Cyber Science and Technology, University of Science and Technology of China; 2 Anhui Province Key Laboratory of Digital Security; 3 the CCCD Key Lab of Ministry of Culture and Tourism |
| Pseudocode | No | The paper describes methods using mathematical equations and textual explanations, and provides architectural diagrams (Figure 1, Figure 2), but does not contain explicitly structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper discusses the proposed methodology and presents experimental results, but does not include any explicit statements about releasing source code for this work, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Following previous works (Mao et al. 2022, 2023; Shah et al. 2023), in this paper, we adopt the evaluation protocols cross-subject (X-sub) and cross-view (X-view) for NTU-RGB+D 60 (Shahroudy et al. 2016) dataset, cross-subject (X-sub) and cross-setup (X-set) for NTU-RGB+D 120 (Liu et al. 2020a) dataset. We also evaluate our method on the PKU-MMD II (Chunhui et al. 2017) (PKU-II) phase. |
| Dataset Splits | Yes | Following previous works (Mao et al. 2022, 2023; Shah et al. 2023), in this paper, we adopt the evaluation protocols cross-subject (X-sub) and cross-view (X-view) for NTU-RGB+D 60 (Shahroudy et al. 2016) dataset, cross-subject (X-sub) and cross-setup (X-set) for NTU-RGB+D 120 (Liu et al. 2020a) dataset. ... in the semi-supervised evaluation protocol, ... we report the performance on the NTU-60 dataset when using 1% and 10% of the training set. |
| Hardware Specification | No | The paper details experimental results and implementation details, but does not provide specific hardware details such as GPU/CPU models or other computational resources used for the experiments. |
| Software Dependencies | No | The paper discusses hyperparameters and refers to previous work for other settings, but does not specify any software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | The momentum hyper-parameter µ, the number of features NQ, and the loss weight λ are set to 0.999, 65536, and 0.0001, respectively. ... in linear evaluation protocol, the pre-trained backbone is fixed and a post-attached linear classifier is trained with supervision for 100 epochs with a batch size of 256 and a learning rate of 0.1. The learning rate is decreased to 0 by the cosine decay schedule. ... in finetuned evaluation protocol, an MLP head is attached to the pre-trained backbone, and the whole network is fully fine-tuned for 100 epochs with a batch size of 48. The learning rate is linearly increased to 3e-4 from 0 in the first 5 warm-up epochs and then decreased to 1e-5 by the cosine decay schedule. We also adopt layer-wise lr decay (Clark et al. 2020) following (Bao et al. 2022). |
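The fine-tuning schedule quoted above (learning rate linearly increased to 3e-4 from 0 over the first 5 warm-up epochs, then cosine-decayed to 1e-5 over the remaining epochs) can be sketched as below. This is an illustrative reconstruction, not code from the paper; the function name and parameter defaults are assumptions drawn from the quoted hyperparameters.

```python
import math

def finetune_lr(epoch, total_epochs=100, warmup_epochs=5,
                peak_lr=3e-4, final_lr=1e-5):
    """Sketch of the fine-tuning schedule described in the paper:
    linear warmup from 0 to peak_lr, then cosine decay to final_lr.
    Defaults follow the hyperparameters quoted in the Experiment Setup row."""
    if epoch < warmup_epochs:
        # Linear warmup: 0 -> peak_lr over the first `warmup_epochs` epochs.
        return peak_lr * epoch / warmup_epochs
    # Cosine decay: peak_lr -> final_lr over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return final_lr + 0.5 * (peak_lr - final_lr) * (1 + math.cos(math.pi * progress))
```

The paper additionally applies layer-wise learning-rate decay (Clark et al. 2020; Bao et al. 2022), i.e. each transformer layer's rate is the schedule value scaled by a per-depth decay factor; that factor is not specified in the quoted text and is therefore omitted here.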