Action-Minimization Meets Generative Modeling: Efficient Transition Path Sampling with the Onsager-Machlup Functional
Authors: Sanjeev Raja, Martin Sipka, Michael Psenka, Tobias Kreiman, Michal Pavelka, Aditi S. Krishnapriyan
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate our approach on varied molecular systems, obtaining diverse, physically realistic transition pathways and generalizing beyond the pre-trained model's original training dataset. We validate our approach on systems of increasing complexity. Starting with the 2D Müller-Brown potential (Section 5.1), we build intuition and show accurate estimation of reaction rates and committor functions using our sampled transition paths. On alanine dipeptide (Section 5.2), our method recovers accurate free energy barriers and outperforms metadynamics and shooting algorithms in efficiency. For fast-folding proteins (Section 5.3), OM action minimization with diffusion or flow-matching models yields transition path ensembles closely aligned with reference MD at a fraction of the cost of traditional MD. Finally, we show that OM optimization with generative models trained on tetrapeptide configurations enables zero-shot TPS on new sequences (Section 5.4). |
| Researcher Affiliation | Academia | 1 Dept. of EECS, UC Berkeley; 2 Faculty of Mathematics and Physics, Charles University; 3 Work done prior to joining Amazon; 4 Dept. of Chemical and Biomolecular Engineering, UC Berkeley; 5 Applied Mathematics and Computational Research, LBNL. Correspondence to: Sanjeev Raja, Aditi S. Krishnapriyan <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Onsager-Machlup Transition Path Optimization with Generative Models Algorithm 2 Initial Guess Path Generation with a Generative Model Algorithm 3 Initial Guess Path Generation with Iterative Unwrapping |
| Open Source Code | No | The text does not contain an explicit statement about the release of their source code or a link to a repository for the methodology described in this paper. It mentions using third-party tools like PyTorch and PDBFixer (https://github.com/openmm/pdbfixer) but not the authors' own implementation. |
| Open Datasets | Yes | We next consider proteins exhibiting fast dynamical transitions, for which millisecond-scale, reference MD simulations were performed in Lindorff-Larsen et al. (2011). We train a flow matching generative model on approximately 3,000 tetrapeptides simulated in Jing et al. (2024b). |
| Dataset Splits | Yes | We run 1,000 parallel simulations for 1,000 steps, yielding a total of 1 million datapoints. Of these, 800,000 are used for training, and 200,000 are reserved for validation. We perform OM optimization on 100 held-out tetrapeptide sequences not seen during training (using the same splits as in (Jing et al., 2024b)). |
| Hardware Specification | Yes | The entire optimization took on the order of hours on one NVIDIA RTX A6000 GPU, including the generation of the initial trajectory. All experiments were performed on a single NVIDIA RTX A6000 GPU. |
| Software Dependencies | No | The paper mentions software components such as PyTorch, Amber ff14SB, Adam optimizer, FEniCS, Amber ff19SB, WHAM algorithm, PDBFixer, and OpenMM, but does not provide specific version numbers for any of these. |
| Experiment Setup | Yes | Table 3. Hyperparameters used for OM action optimization on the Müller-Brown potential with diffusion. The model is trained for 10 epochs, with a batch size of 4096 and a learning rate of 1e-3 using the Adam optimizer (Kingma & Ba, 2017). A 3-layer MLP with a GELU activation and a hidden dimension of 256 (from the Müller-Brown section). Also, Tables 4, 5, 6, and 7 provide detailed hyperparameters for other experiments, and text descriptions of training processes (e.g., 'path length of L=300', 'latent diffusion time τ_opt = 3', '50,000 steps', 'cosine annealing schedule'). |
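The Experiment Setup row quotes a concrete architecture and optimizer configuration for the Müller-Brown diffusion experiment (3-layer MLP, GELU activations, hidden dimension 256, Adam with learning rate 1e-3, batch size 4096). A minimal PyTorch sketch of such a setup is shown below; the input dimensionality, the scalar time conditioning, and the `ScoreMLP` name are assumptions for illustration, since the paper's table lists only the hyperparameters, not the exact network interface.

```python
import torch
import torch.nn as nn

class ScoreMLP(nn.Module):
    """Sketch of a 3-layer MLP denoiser with GELU activations and
    hidden dimension 256, as quoted from the paper's Table 3.
    The 2D input and scalar diffusion-time input are assumptions."""

    def __init__(self, in_dim: int = 2, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim + 1, hidden),  # +1 channel for diffusion time
            nn.GELU(),
            nn.Linear(hidden, hidden),
            nn.GELU(),
            nn.Linear(hidden, in_dim),
        )

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([x, t], dim=-1))

model = ScoreMLP()
# Optimizer settings quoted from the paper: Adam with lr 1e-3.
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# One illustrative training step on random stand-in data
# (batch size 4096, as reported; the loss here is a placeholder).
x = torch.randn(4096, 2)
t = torch.rand(4096, 1)
loss = model(x, t).pow(2).mean()
opt.zero_grad()
loss.backward()
opt.step()
```

This is a configuration sketch only; the actual training objective (the diffusion/score-matching loss) and the 10-epoch loop from the paper are omitted.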