Boosting Protein Graph Representations through Static-Dynamic Fusion
Authors: Pengkang Guo, Bruno Correia, Pierre Vandergheynst, Daniel Probst
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate significant improvements over structure-based approaches across three distinct tasks: atomic adaptability prediction, binding site detection, and binding affinity prediction. Our results validate that combining static and dynamic information provides complementary signals for understanding protein-ligand interactions, offering new possibilities for drug design and structural biology applications. (Section 4, Experiments:) We evaluate our approach using the MISATO dataset (Siebenmorgen et al., 2024b), which contains 19,443 protein-ligand complexes derived from PDBbind (Su et al., 2018; Liu et al., 2017; Wang et al., 2005). Table 1 presents the performance comparison across different graph representations using four architectures. |
| Researcher Affiliation | Academia | 1École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland 2Wageningen University & Research, Wageningen, The Netherlands. Correspondence to: Pengkang Guo <EMAIL>. |
| Pseudocode | No | The paper describes methods in prose and mathematical equations but does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | A.7. Code Availability Implementation will be made available at https://github.com/PKGuo/protein-static-dynamic-fusion.git. |
| Open Datasets | Yes | 4.1. Dataset We evaluate our approach using the MISATO dataset (Siebenmorgen et al., 2024b), which contains 19,443 protein-ligand complexes derived from PDBbind (Su et al., 2018; Liu et al., 2017; Wang et al., 2005). |
| Dataset Splits | Yes | To ensure robust evaluation and prevent information leakage through structural similarities, the dataset is split into training (80%), validation (10%), and test (10%) sets using protein sequence clustering via BLASTP, ensuring that proteins with high sequence similarity are assigned to the same split (details in Appendix A.3). A.3. Dataset Details For atomic adaptability and binding site detection tasks, we used the data splits provided with the MISATO dataset, with 13,597 samples for training, 1,582 for validation, and 1,593 for test. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processors, or memory amounts used for running experiments. It mentions 'higher memory requirements' for some models but no concrete specifications. |
| Software Dependencies | Yes | Each complex undergoes semi-empirical quantum mechanical refinement and 10 ns molecular dynamics simulation using the Amber20 software package. |
| Experiment Setup | Yes | A.2. Implementation Details and Hyperparameters We implemented our models using PyTorch Geometric. Each model consists of 5 GNN layers followed by a two-layer MLP for prediction. We trained models using the Adam optimizer with a learning rate of 1e-4 and batch size of 32. Training epochs were task-specific: 50 for atomic adaptability prediction, 200 for binding site detection, and 500 for binding affinity prediction. |
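The leakage-preventing split described in the Dataset Splits row (cluster proteins by sequence similarity, then assign whole clusters to 80/10/10 train/val/test) can be sketched as below. This is a minimal stdlib-only illustration, not the authors' code: the cluster IDs are assumed to come from an upstream BLASTP clustering step, and the function name `cluster_split` is hypothetical.

```python
import random
from collections import defaultdict

def cluster_split(sample_clusters, fractions=(0.8, 0.1, 0.1), seed=0):
    """Assign whole sequence-similarity clusters to train/val/test so that
    similar proteins never cross split boundaries (leakage prevention).

    sample_clusters: dict mapping sample id -> cluster id (e.g. from BLASTP).
    Because clusters are placed whole, the realized split sizes only
    approximate the requested fractions.
    """
    clusters = defaultdict(list)
    for sample, cid in sample_clusters.items():
        clusters[cid].append(sample)

    order = sorted(clusters)              # deterministic base order...
    random.Random(seed).shuffle(order)    # ...then a seeded shuffle

    n = len(sample_clusters)
    budgets = [f * n for f in fractions]  # target size per split
    splits = ([], [], [])
    for cid in order:
        # greedily place each whole cluster into the split that is
        # currently furthest below its target size
        i = max(range(3), key=lambda k: budgets[k] - len(splits[k]))
        splits[i].extend(clusters[cid])
    return splits  # (train, val, test)
```

With 10 equal clusters of 10 samples each, for example, the greedy placement recovers an exact 80/10/10 split; with uneven real-world cluster sizes the split is only approximately proportional, which is the usual trade-off of cluster-level splitting.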