Exploit Your Latents: Coarse-Grained Protein Backmapping with Latent Diffusion Models
Authors: Rongchao Zhang, Yu Huang, Yiwei Lou, Yi Xin, Haixu Chen, Yongzhi Cao, Hanpin Wang
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that LatCPB is able to backmap CG proteins effectively and achieve outstanding performance. We conduct ablation studies on LatCPB to evaluate the impacts of the contrastive learning (CL) and discrete latent space (DL) components on its performance. As shown in Table 2, with DL enabled alone, we notice an improvement in performance, albeit limited. When DL and CL are used together, performance across all metrics reaches its peak. |
| Researcher Affiliation | Academia | (1) Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education, School of Computer Science, Peking University, Beijing, China; (2) National Engineering Research Center for Software Engineering, Peking University, Beijing, China; (3) National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; (4) Institute of Geriatrics & National Clinical Research Center of Geriatrics Disease, Chinese PLA General Hospital, Beijing, China. EMAIL, EMAIL |
| Pseudocode | No | The paper describes the methodology using mathematical formulations and descriptive text, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, nor does it include a specific repository link or an explicit statement about code release. |
| Open Datasets | Yes | We evaluate the validity of our method using the PED protein dataset (Lazar et al. 2021), a structural collection of intrinsically disordered proteins (IDPs). PED is currently the only database focused on representing the diversity of IDP collections, focusing on biologically interesting protein regions with conformational collections. |
| Dataset Splits | Yes | In the experiments, we utilize approximately 10,000 frames as the training set and select about 240 frames as test data, which come from four different structures: PED00055, PED00090, PED00151, and PED00218. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | Yes | The implementation environment is PyTorch 2.1, and the Adam optimizer is applied to train the model with a learning rate of 10^-3, decayed to zero with a scheduler. |
| Experiment Setup | Yes | The implementation environment is PyTorch 2.1, and the Adam optimizer is applied to train the model with a learning rate of 10^-3, decayed to zero with a scheduler. To evaluate the performance, we compare our model with prior arts that focus on model improvement, including CGVAE (Wang et al. 2022) and GenZProt (Yang and Gómez-Bombarelli 2023). For fair comparisons, we reproduce all methods under the same implementation environment. |
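The reported training configuration (PyTorch 2.1, Adam, learning rate 10^-3 decayed to zero with a scheduler) can be sketched as follows. This is a minimal, hedged reconstruction: the model, batch shape, loss, and total step count are placeholders, since the paper states only the optimizer, the initial learning rate, and that it is decayed to zero; linear decay via `LambdaLR` is one plausible reading of "decayed to zero with a scheduler".

```python
import torch

# Placeholders: the paper does not specify the network architecture,
# training length, or loss used with this optimizer configuration.
model = torch.nn.Linear(16, 16)  # stand-in for the LatCPB network
total_steps = 1000               # assumed; not stated in the paper

# Adam with the reported initial learning rate of 1e-3.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Linear decay from 1e-3 to zero over training -- one common way to
# "decay to zero with a scheduler" in PyTorch.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda step: 1.0 - step / total_steps
)

for step in range(total_steps):
    optimizer.zero_grad()
    loss = model(torch.randn(4, 16)).pow(2).mean()  # dummy loss
    loss.backward()
    optimizer.step()
    scheduler.step()  # learning rate reaches 0 at the final step
```

Any schedule that reaches zero by the end of training (e.g. cosine annealing with `eta_min=0`) would equally match the paper's description.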