Multi-View 3D Human Pose Estimation with Weakly Synchronized Images
Authors: Ling Li, Ruiwen Gu, Chongyang Wang, Junliang Xing, Xinchun Yu, Xiao-Ping Zhang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compared our model with several advanced 3D human pose estimation models (Tome et al. 2018; Iskakov et al. 2019; Zhang et al. 2021) on the Weak Sync Pose3D and Human3.6M (Ionescu et al. 2013) datasets, with the results shown in Table 1 and Table 2, respectively. Table 1 shows the test results of our model after training on the Weak Sync Pose3D dataset. Table 2 presents the performance of our model on the Human3.6M dataset, including both the original data and the one-frame frameshift weakly synchronized data. We use common evaluation metrics, such as mean per-joint position error (MPJPE), to measure the accuracy of the denoised 3D pose against ground-truth data. |
| Researcher Affiliation | Academia | (1) Shenzhen Key Laboratory of Ubiquitous Data Enabling, Shenzhen International Graduate School, Tsinghua University, Shenzhen, China; (2) Department of Computer Science and Technology, Tsinghua University; (3) West China Hospital, Sichuan University. EMAIL, EMAIL, EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes the methodology using diagrams (Figure 2, Figure 3) and textual descriptions, but there are no explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Human3.6M (Ionescu et al. 2013), CMU-panoptic (Joo et al. 2015), MPI-INF-3DHP (Mehta et al. 2017), among others, are the benchmark datasets for multi-view 3D human pose. Human3.6M (Ionescu et al. 2013) is a widely used multi-view 3D human pose dataset. |
| Dataset Splits | Yes | The Weak Sync Pose3D dataset is the first multi-view weakly synchronized 3D human pose estimation dataset. It comprises a total of 835,272 frames of action, with 624,996 frames for training, 93,816 frames for validation, and 116,460 frames for testing. We trained on 5 subjects (S1, S5, S6, S7, S8) and tested on 2 subjects (S9, S11). |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types) used to run the experiments. |
| Software Dependencies | No | The paper mentions general software concepts like 'convolution neural networks and Transformer models' and 'diffusion models', but does not list any specific software dependencies with version numbers (e.g., PyTorch 1.9, Python 3.8, CUDA 11.1). |
| Experiment Setup | No | The paper describes the model and diffusion process but does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs, optimizer settings) or other detailed training configurations. |
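The table quotes MPJPE as the paper's evaluation metric and reports the Weak Sync Pose3D split sizes. As a minimal sketch (not the authors' code), the metric and a sanity check on the reported frame counts could look like the following; the per-joint tuple layout and millimetre units are assumptions:

```python
import math

def mpjpe(pred, gt):
    """Mean per-joint position error: the average Euclidean distance
    between predicted and ground-truth 3D joint positions (assumed mm)."""
    assert len(pred) == len(gt) and pred, "joint lists must match and be non-empty"
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)

# Toy two-joint example: errors of 3 mm and 4 mm average to 3.5 mm.
pred = [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(mpjpe(pred, gt))  # → 3.5

# Sanity check: the reported Weak Sync Pose3D splits sum to the stated total.
splits = {"train": 624_996, "val": 93_816, "test": 116_460}
assert sum(splits.values()) == 835_272
```

Note that the reported train/validation/test counts do sum exactly to the 835,272-frame total, which is consistent with a complete partition of the dataset.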