Predicting mutational effects on protein binding from folding energy

Authors: Arthur Deng, Karsten D. Householder, Fang Wu, K. Christopher Garcia, Brian L. Trippe

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To evaluate StaB-ddG, we first analyze the contributions of different techniques that lead to an improvement in zero-shot ΔΔG_bind prediction accuracy, without training on ΔΔG_bind data. Next, we introduce baseline methods and show that StaB-ddG is the only DL approach to match FoldX and FlexddG; an ensemble constructed by averaging FoldX and StaB-ddG provides state-of-the-art performance. Finally, we evaluate the out-of-distribution accuracy of our approach on two additional binding-strength datasets: one consisting of de novo designed small protein binders, and a second consisting of T cell receptor (TCR) mimic proteins we curate."
Researcher Affiliation | Academia | "Arthur Deng¹, Karsten Householder¹, Fang Wu¹, K. Christopher Garcia¹, Brian Trippe¹. ¹Stanford University. Correspondence to: Arthur Deng <EMAIL>, Brian Trippe <EMAIL>."
Pseudocode | No | The paper describes the methodology using narrative text and mathematical equations, but it does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Code: https://github.com/LDeng0205/StaB-ddG"
Open Datasets | Yes | "...with experimental ΔΔG measurements for fewer than 350 distinct interfaces in the largest public curated dataset (Jankauskaitė et al., 2019)."
Dataset Splits | Yes | "We cluster the complexes using the original SKEMPIv2.0 clusters based on structural homology near the binding site, resulting in 64 disjoint clusters (Jankauskaitė et al., 2019). Then, we perform a random split to obtain 20 clusters with 1,491 mutants across 81 complexes as our test set. We report these clusters and split at https://github.com/LDeng0205/StaB-ddG/blob/main/data/SKEMPI/train_clusters.txt and https://github.com/LDeng0205/StaB-ddG/blob/main/data/SKEMPI/test_clusters.txt."
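The quoted protocol (assign each complex to a structural-homology cluster, then hold out whole clusters at random) can be sketched as follows. This is a minimal illustration, not the paper's code: `complex_to_cluster` and `split_by_cluster` are hypothetical names, and the real split uses the published SKEMPIv2.0 cluster files linked above.

```python
import random

def split_by_cluster(complex_to_cluster, n_test_clusters=20, seed=0):
    """Hold out entire clusters so train and test share no
    interface homology (illustrative sketch).

    complex_to_cluster: dict mapping complex ID -> cluster ID.
    Returns (train_complexes, test_complexes).
    """
    clusters = sorted(set(complex_to_cluster.values()))
    rng = random.Random(seed)
    test_clusters = set(rng.sample(clusters, n_test_clusters))
    train = [c for c, k in complex_to_cluster.items() if k not in test_clusters]
    test = [c for c, k in complex_to_cluster.items() if k in test_clusters]
    return train, test
```

Splitting at the cluster level, rather than per mutation, is what makes the test set disjoint from training in terms of binding-site homology.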
Hardware Specification | Yes | "For StaB-ddG, by contrast, predictions on the same dataset took 13 NVIDIA-5090 GPU-minutes with batched computation (0.2 seconds per mutation). Model fine-tuning of StaB-ddG took 10 hours and 5 hours on the Megascale stability dataset and the SKEMPIv2.0 training split, respectively, on a single H100 GPU."
Software Dependencies | Yes | "We use Rosetta version 3.8 with 35,000 backrub steps and average predictions across 10 models. For FoldX, initial repair steps are computed on the wild-type interface PDB, followed by scoring of individual mutants. We use FoldX version 4.1."
Experiment Setup | Yes | "In summary, we fine-tuned on the Megascale stability dataset using the Adam optimizer with a learning rate of 3e-5 for 70 epochs and a batch size of 25,000 amino acids. We fine-tuned on SKEMPIv2.0 using the Adam optimizer with a learning rate of 1e-6 for 200 epochs and a batch size of 25,000 amino acids."
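The quoted hyperparameters can be collected into a minimal fine-tuning sketch. PyTorch is assumed; `model` and `loader` are placeholders, the loss interface is illustrative, and the real training code batches by amino-acid count (25,000 residues per batch) rather than by a fixed number of examples — none of this is StaB-ddG's actual implementation.

```python
import torch

# Hyperparameters quoted from the paper.
MEGASCALE_LR, MEGASCALE_EPOCHS = 3e-5, 70   # stage 1: Megascale stability data
SKEMPI_LR, SKEMPI_EPOCHS = 1e-6, 200        # stage 2: SKEMPIv2.0 training split
BATCH_TOKENS = 25_000                        # batch size in amino acids, not complexes

def finetune(model, loader, lr, epochs):
    """Generic Adam fine-tuning loop (illustrative placeholder)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for batch in loader:
            loss = model(batch)  # placeholder: model returns a scalar loss
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```

The two quoted stages would then be two calls: first `finetune(model, megascale_loader, MEGASCALE_LR, MEGASCALE_EPOCHS)`, then `finetune(model, skempi_loader, SKEMPI_LR, SKEMPI_EPOCHS)`, with the much smaller second learning rate limiting drift away from the stability-pretrained weights.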