Semantics-aware Test-time Adaptation for 3D Human Pose Estimation
Authors: Qiuxia Lin, Rongyu Chen, Kerui Gu, Angela Yao
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5. Experiments Datasets. We follow the adaptation tasks from previous work (Zhang et al., 2020; Nam et al., 2023), using Human3.6M (Ionescu et al., 2013) as the labeled training dataset and 3DPW (Von Marcard et al., 2018) and 3DHP (Mehta et al., 2017) as the unlabeled test datasets. [...] Evaluation metrics. We report three evaluation metrics: Mean Per Joint Position Error (MPJPE) [...] 5.3. Quantitative Results [...] 5.4. Analysis Experiments Ablations on the method components. Semantics-incorporated strategies analysis. Improvement distribution. Runtime analysis. 2D fill-in analysis. |
| Researcher Affiliation | Academia | 1Department of Computer Science, National University of Singapore, Singapore. Correspondence to: Qiuxia Lin <EMAIL>. |
| Pseudocode | No | The paper describes the methodology through prose and a diagram (Figure 2) but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code for the methodology described, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Datasets. We follow the adaptation tasks from previous work (Zhang et al., 2020; Nam et al., 2023), using Human3.6M (Ionescu et al., 2013) as the labeled training dataset and 3DPW (Von Marcard et al., 2018) and 3DHP (Mehta et al., 2017) as the unlabeled test datasets. Human3.6M is a widely used indoor dataset comprising 3.6 million images annotated with 2D and 3D labels. [...] Furthermore, we validate our method on an egocentric dataset, EgoBody (Zhang et al., 2022). |
| Dataset Splits | Yes | Datasets. We follow the adaptation tasks from previous work (Zhang et al., 2020; Nam et al., 2023), using Human3.6M (Ionescu et al., 2013) as the labeled training dataset and 3DPW (Von Marcard et al., 2018) and 3DHP (Mehta et al., 2017) as the unlabeled test datasets. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments. |
| Software Dependencies | No | The paper mentions several software components like Openpose (Cao et al., 2017), Adam optimizer (Kingma & Ba, 2014), Motion CLIP (Tevet et al., 2022), GPT-4o (Achiam et al., 2023), and CLIP (Radford et al., 2021), but it does not specify version numbers for these components. |
| Experiment Setup | Yes | At the start of test-time adaptation for each test video, the model parameters are initialized with the pre-trained values, following (Nam et al., 2023). We employed the Adam optimizer (Kingma & Ba, 2014) with parameters set to beta1 = 0.5, beta2 = 0.9, and a learning rate of 5.0e-5. A cosine scheduler is used with a minimum learning rate of 1.0e-6. The input images are resized to 224 × 224, and the frame number of each video segment is 60. We use a batch size of 4 and the total training epoch is 6. The hyperparameters are λ1 = 0.1, λ2 = 0.2, σ = 0.75, α = 0.9. We use Openpose (Cao et al., 2017) to provide 2D poses with a confidence threshold of 0.3 (Gu et al., 2024). |
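The optimization hyperparameters quoted in the Experiment Setup row (Adam with beta1 = 0.5, beta2 = 0.9, learning rate 5.0e-5, cosine decay to a 1.0e-6 floor over 6 epochs) imply a standard cosine-annealing learning-rate schedule. Below is a minimal sketch of that schedule in plain Python; the formula is the conventional cosine-annealing rule, and the function name is illustrative rather than taken from the paper.

```python
import math

# Hyperparameters as reported in the paper's experiment setup:
# lr 5.0e-5 decayed by a cosine scheduler to a 1.0e-6 minimum over 6 epochs.
LR_MAX, LR_MIN, EPOCHS = 5.0e-5, 1.0e-6, 6

def cosine_lr(epoch: int) -> float:
    """Cosine-annealed learning rate at a given epoch in [0, EPOCHS]."""
    t = epoch / EPOCHS  # fraction of the schedule completed
    return LR_MIN + 0.5 * (LR_MAX - LR_MIN) * (1.0 + math.cos(math.pi * t))

for e in range(EPOCHS + 1):
    print(f"epoch {e}: lr = {cosine_lr(e):.2e}")
```

Under this schedule the rate starts at 5.0e-5, falls smoothly, and reaches exactly 1.0e-6 at the final epoch, matching the reported minimum learning rate.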