Action-Minimization Meets Generative Modeling: Efficient Transition Path Sampling with the Onsager-Machlup Functional
Authors: Sanjeev Raja, Martin Sipka, Michael Psenka, Tobias Kreiman, Michal Pavelka, Aditi S. Krishnapriyan
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate our approach on varied molecular systems, obtaining diverse, physically realistic transition pathways and generalizing beyond the pre-trained model's original training dataset. We validate our approach on systems of increasing complexity. Starting with the 2D Müller-Brown potential (Section 5.1), we build intuition and show accurate estimation of reaction rates and committor functions using our sampled transition paths. On alanine dipeptide (Section 5.2), our method recovers accurate free energy barriers and outperforms metadynamics and shooting algorithms in efficiency. For fast-folding proteins (Section 5.3), OM action minimization with diffusion or flow-matching models yields transition path ensembles closely aligned with reference MD at a fraction of the cost of traditional MD. Finally, we show that OM optimization with generative models trained on tetrapeptide configurations enables zero-shot TPS on new sequences (Section 5.4). |
| Researcher Affiliation | Academia | 1 Dept. of EECS, UC Berkeley; 2 Faculty of Mathematics and Physics, Charles University; 3 Work done prior to joining Amazon; 4 Dept. of Chemical and Biomolecular Engineering, UC Berkeley; 5 Applied Mathematics and Computational Research, LBNL. Correspondence to: Sanjeev Raja, Aditi S. Krishnapriyan <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Onsager-Machlup Transition Path Optimization with Generative Models Algorithm 2 Initial Guess Path Generation with a Generative Model Algorithm 3 Initial Guess Path Generation with Iterative Unwrapping |
| Open Source Code | No | The text does not contain an explicit statement about the release of their source code or a link to a repository for the methodology described in this paper. It mentions using third-party tools like PyTorch and PDBFixer (https://github.com/openmm/pdbfixer) but not the authors' own implementation. |
| Open Datasets | Yes | We next consider proteins exhibiting fast dynamical transitions, for which millisecond-scale, reference MD simulations were performed in Lindorff-Larsen et al. (2011). We train a flow matching generative model on approximately 3,000 tetrapeptides simulated in Jing et al. (2024b). |
| Dataset Splits | Yes | We run 1,000 parallel simulations for 1,000 steps, yielding a total of 1 million datapoints. Of these, 800,000 are used for training, and 200,000 are reserved for validation. We perform OM optimization on 100 held-out tetrapeptide sequences not seen during training (using the same splits as in (Jing et al., 2024b)). |
| Hardware Specification | Yes | The entire optimization took on the order of hours on one NVIDIA RTX A6000 GPU, including the generation of the initial trajectory. All experiments were performed on a single NVIDIA RTX A6000 GPU. |
| Software Dependencies | No | The paper mentions software components such as PyTorch, Amber ff14SB, Adam optimizer, FEniCS, Amber ff19SB, WHAM algorithm, PDBFixer, and OpenMM, but does not provide specific version numbers for any of these. |
| Experiment Setup | Yes | Table 3. Hyperparameters used for OM action optimization on the Müller-Brown potential with diffusion. The model is trained for 10 epochs, with a batch size of 4096 and a learning rate of 1e-3 using the Adam optimizer (Kingma & Ba, 2017). A 3-layer MLP with a GELU activation and a hidden dimension of 256 (from the Müller-Brown section). Also, Tables 4, 5, 6, and 7 provide detailed hyperparameters for other experiments, and text descriptions of training processes (e.g., 'path length of L=300', 'latent diffusion time τ_opt = 3', '50,000 steps', 'cosine annealing schedule'). |
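The Experiment Setup row quotes a concrete architecture and optimizer configuration for the Müller-Brown diffusion experiment (3-layer MLP, GELU activations, hidden dimension 256, Adam with learning rate 1e-3, batch size 4096). A minimal PyTorch sketch of such a setup is shown below; the input dimensionality, the scalar time conditioning, and the `ScoreMLP` name are assumptions for illustration, since the paper's table lists only the hyperparameters, not the exact network interface.

```python
import torch
import torch.nn as nn

class ScoreMLP(nn.Module):
    """Sketch of a 3-layer MLP denoiser with GELU activations and
    hidden dimension 256, as quoted from the paper's Table 3.
    The 2D input and scalar diffusion-time input are assumptions."""

    def __init__(self, in_dim: int = 2, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim + 1, hidden),  # +1 channel for diffusion time
            nn.GELU(),
            nn.Linear(hidden, hidden),
            nn.GELU(),
            nn.Linear(hidden, in_dim),
        )

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([x, t], dim=-1))

model = ScoreMLP()
# Optimizer settings quoted from the paper: Adam with lr 1e-3.
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# One illustrative training step on random stand-in data
# (batch size 4096, as reported; the loss here is a placeholder).
x = torch.randn(4096, 2)
t = torch.rand(4096, 1)
loss = model(x, t).pow(2).mean()
opt.zero_grad()
loss.backward()
opt.step()
```

This is a configuration sketch only; the actual training objective (the diffusion/score-matching loss) and the 10-epoch loop from the paper are omitted.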