reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

NMA-tune: Generating Highly Designable and Dynamics Aware Protein Backbones

Authors: Urszula Julia Komorowska, Francisco Vargas, Alessandro Rondina, Pietro Lio, Mateja Jamnik

ICML 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate the effectiveness of the conditioner and its impact on the sample designability using 3 proteins chosen from the literature. We found that for the dynamics conditioned samples, there exist protein sequences that will fold to protein backbones with desired normal modes, and NMA-tune outperforms existing state-of-the-art. We run MD simulations on selected samples and perform Principal Component Analysis (PCA) on their trajectories.
Researcher Affiliation	Academia	(1) Department of Computer Science and Technology, University of Cambridge, Cambridge, United Kingdom; (2) Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy.
Pseudocode	No	The paper describes its methodology in detailed text and includes a 'Conditioning framework diagram' in Figure 6 (Appendix B), but it does not contain explicitly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	We provide the conditioner as a ready to use plug-in to the open-source model RFdiffusion. Since RFdiffusion is already able to do motif scaffolding, this constitutes an extension to joint conditioning that can easily be used by the wider research community.
Open Datasets	Yes	Dataset for training ϵθ was based on the SCOPe database (Fox et al., 2013; Chandonia et al., 2021). ... We chose: triglyceride lipase (Derewenda et al., 1992) (PDB id: 4tgl), calmodulin (Khade et al., 2021) (PDB id: 1exr), and HIV-1 protease in semi-open conformation (Hornak et al., 2006) (PDB id: 1hhp).
Dataset Splits	Yes	7139 proteins remained and we used train:validation:test split 0.8:0.1:0.1.
Hardware Specification	Yes	ϵθ is trained for 10 epochs with Adam optimiser, which took 15h on a single Nvidia A100 80GB. ... Molecular dynamics (MD) simulations were performed using GROMACS 2024 version and NVIDIA A100 80GB GPU.
Software Dependencies	Yes	Molecular dynamics (MD) simulations were performed using GROMACS 2024 version and NVIDIA A100 80GB GPU.
Experiment Setup	Yes	ϵθ is trained for 10 epochs with Adam optimiser, which took 15h on a single Nvidia A100 80GB. The learning rate 1e-4 is decreased by 0.1 after 5000 gradient updates and batch size is 32. ... We used 50 sampling steps, which is the default in RFdiffusion. ... Additionally, for targets 1hhp and 1exr we ablate the time scaling function in Table 4. ... All losses are weighted as $L = 0.05\,L_{\text{noise}} + 0.8\,L_{\text{NMA}} + 0.1\,L_{\text{chain}} + 0.05\,L_{\text{rg}}$ (Equation 11), where $L_{\text{NMA}}$ is the same as $l_y(x)$ in Equation 8.
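
The numeric settings quoted in the Dataset Splits and Experiment Setup rows can be written out as a minimal Python sketch. Only the numbers (7139 proteins, the 0.8:0.1:0.1 split, learning rate 1e-4, factor 0.1, 5000 updates, and the four loss weights) come from the quoted text. The function names are hypothetical, the schedule assumes "decreased by 0.1" means a single multiplication by 0.1, and assigning the rounding remainder to the test set is an assumption. This is not the authors' code.

```python
# Illustrative sketch only: names are hypothetical, values are from the quoted text.

# Weights of the combined loss (Equation 11 in the paper, as quoted above).
LOSS_WEIGHTS = {"noise": 0.05, "nma": 0.8, "chain": 0.1, "rg": 0.05}


def split_sizes(n_total, fractions=(0.8, 0.1, 0.1)):
    """Return (train, validation, test) counts that sum to n_total.

    Assumption: train and validation counts are rounded down and the
    remainder goes to the test set. The source does not say how rounding
    was handled.
    """
    n_train = int(n_total * fractions[0])
    n_val = int(n_total * fractions[1])
    n_test = n_total - n_train - n_val
    return n_train, n_val, n_test


def learning_rate(updates_done, base_lr=1e-4, factor=0.1, decay_after=5000):
    """Learning rate after `updates_done` gradient updates.

    Assumption: one step decay, multiplying base_lr by `factor` once
    `decay_after` updates have been completed.
    """
    if updates_done >= decay_after:
        return base_lr * factor
    return base_lr


def total_loss(l_noise, l_nma, l_chain, l_rg, weights=LOSS_WEIGHTS):
    """Weighted sum of the four loss terms."""
    return (
        weights["noise"] * l_noise
        + weights["nma"] * l_nma
        + weights["chain"] * l_chain
        + weights["rg"] * l_rg
    )


if __name__ == "__main__":
    print(split_sizes(7139))
    print(learning_rate(0), learning_rate(5000))
    print(total_loss(1.0, 1.0, 1.0, 1.0))
```

The four weights sum to 1, so the combined loss is a convex combination of the individual terms, with the normal mode term carrying most of the weight.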