Boltzmann priors for Implicit Transfer Operators

Authors: Juan Viguera Diez, Mathias Schreiner, Ola Engkvist, Simon Olsson

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that BoPITO interpolators can recover approximate dynamics from models trained on biased simulations. For the Prinz potential, we find that while ITO suffers from poor performance modeling long-term dynamics when data is scarce, BoPITO models accurately capture long-term dynamics without worsening performance on short and medium time-scales. Furthermore, the sections '5.2 BOLTZMANN PRIORS FOR TRAINING DATA GENERATION', '5.3 BOPITO EFFICIENTLY SAMPLES LONG-TERM DYNAMICS IN A LOW-DATA CONTEXT', and '5.4 INTERPOLATING BETWEEN MODELS TRAINED ON OFF-EQUILIBRIUM DATA AND THE BOLTZMANN DISTRIBUTION WITH EXPERIMENTAL DATA' all describe empirical evaluation and data analysis.
Researcher Affiliation | Collaboration | 1. Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, SE-412 96 Gothenburg, Sweden. 2. Molecular AI, Discovery Sciences, R&D, AstraZeneca Gothenburg, Pepparedsleden 1, 431 50 Mölndal, Sweden.
Pseudocode | Yes | Algorithm 1: Training (DisExp is defined in Algorithm 4). Algorithm 2: Sampling from pθ(x0, N). Algorithm 3: Ancestral sampling. Algorithm 4: Sampling from DisExp.
Open Source Code | Yes | Code is available at https://github.com/olsson-group/bopito.
Open Datasets | Yes | The Prinz potential is a 1D potential commonly used for benchmarking MD sampling methods (Prinz et al., 2011). We use publicly available data from Dibak et al. (2022), containing 1 µs of simulation time split into 20 trajectories.
Dataset Splits | Yes | For different numbers of trajectories, n, we train 5000/n ITO models on n Prinz potential trajectories of length 150; trajectories do not overlap among different trainings. Prinz potential: 10 models are trained on non-overlapping trajectory sets for each number of trajectories. Alanine dipeptide: 10 models are trained on potentially overlapping random trajectory sets for each number of trajectories. Chignolin: 4 models are trained on potentially overlapping random trajectory sets for each number of trajectories.
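The non-overlapping split protocol quoted above (5000/n models, each trained on n disjoint trajectories) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the helper name `non_overlapping_splits` and the use of trajectory indices are assumptions.

```python
import numpy as np

def non_overlapping_splits(pool, n, seed=0):
    """Partition a pool of trajectory indices into disjoint sets of n,
    one set per model, so that no trajectory is shared between trainings.
    Hypothetical helper sketching the paper's split protocol."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(pool))
    n_models = len(pool) // n  # 5000 / n models in the Prinz experiment
    return [[pool[i] for i in order[m * n:(m + 1) * n]]
            for m in range(n_models)]

# 5000 trajectory indices, n = 10 -> 500 disjoint training sets of 10.
splits = non_overlapping_splits(list(range(5000)), n=10)
```

Because the permutation is sliced into consecutive, non-overlapping windows, every trajectory index appears in exactly one training set.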
Hardware Specification | No | The computations in this work were enabled by the Berzelius resource provided by the Knut and Alice Wallenberg Foundation at the National Supercomputer Centre. This statement does not provide specific hardware details such as GPU or CPU models, memory, or processing speeds.
Software Dependencies | No | We generate trajectories with an Euler-Maruyama integrator from the Deeptime library (Hoffmann et al., 2021). The paper names Deeptime but does not specify its version number, and no other software dependencies are listed with version numbers.
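For context, Euler-Maruyama integration of overdamped Langevin dynamics (the scheme named above) can be sketched in plain NumPy. This is a minimal stand-in, not the Deeptime implementation, and the double-well gradient below is an illustrative placeholder for the 1D Prinz potential of Prinz et al. (2011).

```python
import numpy as np

def grad_double_well(x):
    # Gradient of the stand-in potential V(x) = (x^2 - 1)^2.
    # The paper's experiments use the Prinz potential instead.
    return 4.0 * x * (x**2 - 1.0)

def euler_maruyama(x0, n_steps, dt=1e-3, kT=1.0, seed=0):
    """Euler-Maruyama step for overdamped Langevin dynamics:
    x_{k+1} = x_k - grad V(x_k) * dt + sqrt(2 kT dt) * xi,  xi ~ N(0, 1).
    """
    rng = np.random.default_rng(seed)
    traj = np.empty(n_steps + 1)
    traj[0] = x0
    for k in range(n_steps):
        noise = np.sqrt(2.0 * kT * dt) * rng.standard_normal()
        traj[k + 1] = traj[k] - grad_double_well(traj[k]) * dt + noise
    return traj

# A trajectory of length 150, matching the split description above.
traj = euler_maruyama(x0=-1.0, n_steps=150)
```

The noise scale sqrt(2 kT dt) is the standard discretization of the Brownian term; halving dt therefore also shrinks each random kick.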
Experiment Setup | Yes | We report architectural and training hyper-parameters in Table 1: diffusion steps (500, 1000), noise schedule (sigmoidal, polynomial), batch size (2,097,152, 1,024/32), learning rate (0.001), layers (3, 5), embedding dimension (256), net dimension (256), optimizer (Adam), inference ODE steps (50, 100/50).
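The Table 1 entries can be collected into a single configuration mapping. The key names below are hypothetical, not the authors' schema, and paired values are reproduced exactly as listed without guessing which experimental system each belongs to.

```python
# Hypothetical config dict mirroring Table 1's hyper-parameters.
# Tuples hold the paired per-system values verbatim from the table.
table1 = {
    "diffusion_steps": (500, 1000),
    "noise_schedule": ("sigmoidal", "polynomial"),
    "batch_size": (2_097_152, "1,024/32"),
    "learning_rate": 1e-3,
    "layers": (3, 5),
    "embedding_dim": 256,
    "net_dim": 256,
    "optimizer": "adam",
    "inference_ode_steps": (50, "100/50"),
}
```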