ADHMR: Aligning Diffusion-based Human Mesh Recovery via Direct Preference Optimization

Authors: Wenhao Shen, Wanqi Yin, Xiaofeng Yang, Cheng Chen, Chaoyue Song, Zhongang Cai, Lei Yang, Hao Wang, Guosheng Lin

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that ADHMR outperforms current state-of-the-art methods.
Researcher Affiliation | Collaboration | 1 Nanyang Technological University; 2 SenseTime Research; 3 The Hong Kong University of Science and Technology (Guangzhou). Correspondence to: Guosheng Lin <EMAIL>, Hao Wang <EMAIL>.
Pseudocode | No | The paper describes the methodology in prose and mathematical formulations (e.g., Equations 1-6), but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at: https://github.com/shenwenhao01/ADHMR.
Open Datasets | Yes | HMR-Scorer is trained on five datasets: HI4D (Yin et al., 2023), BEDLAM (Black et al., 2023), DNA-Rendering (Cheng et al., 2023), GTA-Human (Cai et al., 2024b), and SPEC (Kocabas et al., 2021). Table 2 compares the accuracy of ADHMR with state-of-the-art methods on two widely used benchmark datasets, Human3.6M (Ionescu et al., 2013) and 3DPW (Von Marcard et al., 2018).
Dataset Splits | Yes | The original test sets of the two selected datasets are used, and the base model is first finetuned on the training sets of the two target benchmarks (3DPW and Human3.6M).
Hardware Specification | No | The paper does not specify the hardware used to run the experiments, such as GPU models, CPU types, or memory.
Software Dependencies | No | The paper does not list software dependencies with version numbers.
Experiment Setup | No | The paper states the number of predictions ('M = 100 for all models' and 'M = 200 for all models') and the data-cleaning filter threshold (τ = 0.6), but it lacks hyperparameter details such as learning rates, batch sizes, epochs, and optimizer configurations.