Inverse problems with experiment-guided AlphaFold
Authors: Sai Advaith Maddipatla, Nadav Bojan, Meital Bojan, Sanketh Vedula, Paul Schanda, Ailie Marx, Alexander Bronstein
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive real-data experiments, we demonstrate the generality of our method to incorporate a variety of experimental measurements. In particular, our framework uncovers previously unmodeled conformational heterogeneity from crystallographic densities, and generates high-accuracy NMR ensembles orders of magnitude faster than the status quo. Notably, we demonstrate that our ensembles outperform AlphaFold3 (Abramson et al., 2024) and sometimes better fit experimental data than publicly deposited structures to the Protein Data Bank (PDB, Burley et al. (2017)). |
| Researcher Affiliation | Academia | ¹Technion Israel Institute of Technology, Israel. ²University of Oxford, UK. ³Institute of Science and Technology, Austria. ⁴Tel Hai Academic College, Israel. ⁵MIGAL Galilee Research Institute, Israel. |
| Pseudocode | Yes | The pseudocode for guided AlphaFold3 and other implementation details are presented in Appendix A.1. Algorithm 1: AlphaFold3 guidance. Algorithm 2: Selecting samples using matching pursuit (Mallat & Zhang, 1993). |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It mentions using the 'open-sourced Protenix (Chen et al., 2025) model' and 'official AlphaFold3 weights and source code (Abramson et al., 2024)', but these are third-party tools and models the authors used, not their own implementation code. |
| Open Datasets | Yes | (PDB: 7JX6 and 7F5F, color coded as purple and green, respectively) ... (PDB: 4OLE) exhibits a multi-modal backbone distribution at 423-431 ... NMR structure ensemble (PDB: 2K52) ... NMR structure PDB 1D3Z ... We used the benchmark from McDonald et al. (2023) ... 100 NMR spectra database (Klukowski et al., 2024) |
| Dataset Splits | No | The paper does not explicitly provide dataset splits (e.g., train/test/validation percentages or counts) for its experiments. It refers to specific PDB entries and datasets for evaluation but does not define how these were partitioned into subsets for training or testing their own methods, beyond selecting specific cases. |
| Hardware Specification | Yes | All computations were performed on NVIDIA H100 and L40S GPUs. |
| Software Dependencies | No | The paper mentions 'open-sourced Protenix (Chen et al., 2025) model, a PyTorch-based (Paszke et al., 2019)', 'AMBER force field (Wang et al., 2004)', 'ColabFold implementation (Mirdita et al., 2022)', 'Gemmi (Wojdyr, 2022)', 'Adam (Diederik, 2015) optimizer', and 'pynmrstar library'. However, specific version numbers are generally not provided for key software components such as PyTorch or pynmrstar, and these are required for reproducibility. |
| Experiment Setup | Yes | For density-guidance, we used equation (1) as the primary log-likelihood function... We used λ = 0.1 to scale the substructure conditioner. For guidance, we used η = 0.1 in equation (5). For guidance, we evaluated η = 0.3, 0.5 in equation (5), and selected the parameter based on the number of restraints obeyed. We optimized B using Adam (Diederik, 2015) optimizer with a step size of 1.0 over 100 iterations. Across all experiments, we set the maximum ensemble size to n_max = 5. To check for structures with broken bonds, we determine all bonded atom pairs within the protein structure... exceeds τ_bond = 2.1 Å. In addition, to check for structures with steric clashes... less than τ_clash = 1.1 Å. |
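
The broken-bond and steric-clash checks quoted in the Experiment Setup row can be illustrated with a minimal sketch. Only the two thresholds (2.1 Å and 1.1 Å) come from the paper. The function names, the input layout (a list of 3D coordinates plus a list of bonded index pairs), and the choice to test clashes only between non-bonded pairs are assumptions made here for illustration, since the quoted text elides how bonded pairs are determined and which pairs are tested for clashes.

```python
import math
from itertools import combinations

TAU_BOND = 2.1   # Å, bond-length threshold quoted from the paper
TAU_CLASH = 1.1  # Å, clash-distance threshold quoted from the paper


def has_broken_bond(coords, bonds, tau_bond=TAU_BOND):
    """Return True if any bonded atom pair is farther apart than tau_bond."""
    return any(math.dist(coords[i], coords[j]) > tau_bond for i, j in bonds)


def has_steric_clash(coords, bonds, tau_clash=TAU_CLASH):
    """Return True if any non-bonded atom pair is closer than tau_clash.

    Assumption: bonded pairs are excluded from the clash test.
    """
    bonded = {frozenset(pair) for pair in bonds}
    for i, j in combinations(range(len(coords)), 2):
        if frozenset((i, j)) in bonded:
            continue
        if math.dist(coords[i], coords[j]) < tau_clash:
            return True
    return False


def is_geometrically_valid(coords, bonds):
    """Hypothetical filter: keep a sample only if it passes both checks."""
    return not has_broken_bond(coords, bonds) and not has_steric_clash(coords, bonds)


# Toy three-atom chain with bonds 0-1 and 1-2 (coordinates in Å).
bonds = [(0, 1), (1, 2)]
chain_ok = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
chain_stretched = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (4.0, 0.0, 0.0)]
chain_clashing = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.6, 0.6, 0.0)]
```

The all-pairs loop is quadratic in the number of atoms, which is fine for a toy example; a spatial index such as a cell list or k-d tree would be the usual choice for full protein structures.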