Causal-Inspired Multitask Learning for Video-Based Human Pose Estimation
Authors: Haipeng Chen, Sifan Wu, Zhigang Wang, Yifang Yin, Yingying Jiao, Yingda Lyu, Zhenguang Liu
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that our method outperforms state-of-the-art methods on three large-scale benchmark datasets. |
| Researcher Affiliation | Academia | 1 College of Computer Science and Technology, Jilin University; 2 Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University; 3 College of Computer Science and Technology, Zhejiang Gongshang University; 4 Institute for Infocomm Research (I2R), A*STAR; 5 Public Computer Education and Research Center, Jilin University; 6 The State Key Laboratory of Blockchain and Data Security, Zhejiang University; 7 Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security |
| Pseudocode | No | The paper describes methods and procedures in paragraph text and figures, but does not contain a clearly labeled pseudocode or algorithm block. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing code, nor does it provide any links to a code repository. |
| Open Datasets | Yes | We evaluate the proposed CM-Pose for video-based human pose estimation in three widely used datasets: PoseTrack2017 (Iqbal, Milan, and Gall 2017), PoseTrack2018 (Andriluka et al. 2018), and PoseTrack2021 (Doering et al. 2022). |
| Dataset Splits | Yes | PoseTrack2017 includes 80,144 pose annotations and has two subsets, i.e., training (train) and validation (val) with 250 videos and 50 videos (split according to the official protocol), respectively. PoseTrack2018 largely increases the number of video clips and pose annotations, including 593 videos for training and 170 videos for validation, and the total number of pose annotations is 153,615. PoseTrack2018 also introduces an additional flag characterizing joint visibility. PoseTrack2021 further increases the number of pose annotations for small or crowded persons, including 177,164 labels. All three datasets identify 15 keypoints, and the training set is densely labeled in the center 30 frames, while the validation set contains additional pose annotations every 4 frames. |
| Hardware Specification | Yes | We implement our method CM-Pose for human pose estimation with PyTorch, which is trained on 2 NVIDIA GeForce RTX 4090 GPUs and terminated with 20 epochs. |
| Software Dependencies | No | We implement our method CM-Pose for human pose estimation with PyTorch... The paper mentions PyTorch but does not specify a version number or other software dependencies with their versions. |
| Experiment Setup | Yes | We set the image size as 256 × 192. The time span ω is set to 1. The number of keypoint tokens K is 15. We use the AdamW optimizer to train the model with an initial learning rate of 2e-4 (decays to 2e-5, 2e-6, 2e-7 at the 5th, 12th, and 18th epochs, respectively). |
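
The step-wise learning-rate schedule quoted in the Experiment Setup row can be sketched in plain Python as follows. This is an illustrative reading of the reported numbers, not the authors' code: the function name `lr_at_epoch`, the constant names, and the assumption that the decay takes effect from the epoch whose counter equals the milestone are all hypothetical. The paper does not say whether epochs are counted from 0 or 1.

```python
from bisect import bisect_right

# Values taken from the Experiment Setup row.
TOTAL_EPOCHS = 20
MILESTONES = [5, 12, 18]                # epochs at which the rate drops
LEARNING_RATES = [2e-4, 2e-5, 2e-6, 2e-7]  # initial rate, then one per milestone


def lr_at_epoch(epoch: int) -> float:
    """Return the learning rate in effect during `epoch`.

    Assumption (not stated in the paper): the decay applies starting with
    the epoch whose counter equals the milestone.
    """
    if not 0 <= epoch < TOTAL_EPOCHS:
        raise ValueError(f"epoch must be in [0, {TOTAL_EPOCHS}), got {epoch}")
    # bisect_right counts how many milestones are <= epoch,
    # which is the number of decays already applied.
    return LEARNING_RATES[bisect_right(MILESTONES, epoch)]


if __name__ == "__main__":
    for epoch in range(TOTAL_EPOCHS):
        print(epoch, lr_at_epoch(epoch))
```

Each step divides the rate by 10, so in PyTorch the same shape would typically be expressed with `torch.optim.lr_scheduler.MultiStepLR` using these milestones and `gamma=0.1`; whether the authors did so is not stated.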