Boosting Protein Graph Representations through Static-Dynamic Fusion
Authors: Pengkang Guo, Bruno Correia, Pierre Vandergheynst, Daniel Probst
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate significant improvements over structure-based approaches across three distinct tasks: atomic adaptability prediction, binding site detection, and binding affinity prediction. Our results validate that combining static and dynamic information provides complementary signals for understanding protein-ligand interactions, offering new possibilities for drug design and structural biology applications. (Section 4, Experiments:) We evaluate our approach using the MISATO dataset (Siebenmorgen et al., 2024b), which contains 19,443 protein-ligand complexes derived from PDBbind (Su et al., 2018; Liu et al., 2017; Wang et al., 2005). Table 1 presents the performance comparison across different graph representations using four architectures. |
| Researcher Affiliation | Academia | 1École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland 2Wageningen University & Research, Wageningen, The Netherlands. Correspondence to: Pengkang Guo <EMAIL>. |
| Pseudocode | No | The paper describes methods in prose and mathematical equations but does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | A.7. Code Availability Implementation will be made available at https://github.com/PKGuo/protein-static-dynamic-fusion.git. |
| Open Datasets | Yes | 4.1. Dataset We evaluate our approach using the MISATO dataset (Siebenmorgen et al., 2024b), which contains 19,443 protein-ligand complexes derived from PDBbind (Su et al., 2018; Liu et al., 2017; Wang et al., 2005). |
| Dataset Splits | Yes | To ensure robust evaluation and prevent information leakage through structural similarities, the dataset is split into training (80%), validation (10%), and test (10%) sets using protein sequence clustering via BLASTP, ensuring that proteins with high sequence similarity are assigned to the same split (details in Appendix A.3). A.3. Dataset Details For atomic adaptability and binding site detection tasks, we used the data splits provided with the MISATO dataset, with 13,597 samples for training, 1,582 for validation, and 1,593 for test. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processors, or memory amounts used for running experiments. It mentions 'higher memory requirements' for some models but no concrete specifications. |
| Software Dependencies | Yes | Each complex undergoes semi-empirical quantum mechanical refinement and 10 ns molecular dynamics simulation using the Amber20 software package. |
| Experiment Setup | Yes | A.2. Implementation Details and Hyperparameters We implemented our models using PyTorch Geometric. Each model consists of 5 GNN layers followed by a two-layer MLP for prediction. We trained models using the Adam optimizer with a learning rate of 1e-4 and batch size of 32. Training epochs were task-specific: 50 for atomic adaptability prediction, 200 for binding site detection, and 500 for binding affinity prediction. |
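The leakage-preventing split described in the Dataset Splits row (cluster proteins by sequence similarity, then assign whole clusters to 80/10/10 train/val/test) can be sketched as below. This is a minimal stdlib-only illustration, not the authors' code: the cluster IDs are assumed to come from an upstream BLASTP clustering step, and the function name `cluster_split` is hypothetical.

```python
import random
from collections import defaultdict

def cluster_split(sample_clusters, fractions=(0.8, 0.1, 0.1), seed=0):
    """Assign whole sequence-similarity clusters to train/val/test so that
    similar proteins never cross split boundaries (leakage prevention).

    sample_clusters: dict mapping sample id -> cluster id (e.g. from BLASTP).
    Because clusters are placed whole, the realized split sizes only
    approximate the requested fractions.
    """
    clusters = defaultdict(list)
    for sample, cid in sample_clusters.items():
        clusters[cid].append(sample)

    order = sorted(clusters)              # deterministic base order...
    random.Random(seed).shuffle(order)    # ...then a seeded shuffle

    n = len(sample_clusters)
    budgets = [f * n for f in fractions]  # target size per split
    splits = ([], [], [])
    for cid in order:
        # greedily place each whole cluster into the split that is
        # currently furthest below its target size
        i = max(range(3), key=lambda k: budgets[k] - len(splits[k]))
        splits[i].extend(clusters[cid])
    return splits  # (train, val, test)
```

With 10 equal clusters of 10 samples each, for example, the greedy placement recovers an exact 80/10/10 split; with uneven real-world cluster sizes the split is only approximately proportional, which is the usual trade-off of cluster-level splitting.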