Boltzmann priors for Implicit Transfer Operators
Authors: Juan Viguera Diez, Mathias Schreiner, Ola Engkvist, Simon Olsson
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that BoPITO interpolators can recover approximate dynamics from models trained on biased simulations. For the Prinz potential, we find that while ITO models long-term dynamics poorly when data is scarce, BoPITO models accurately capture long-term dynamics without degrading performance on short and medium time-scales. Furthermore, the sections '5.2 BOLTZMANN PRIORS FOR TRAINING DATA GENERATION', '5.3 BOPITO EFFICIENTLY SAMPLES LONG-TERM DYNAMICS IN A LOW-DATA CONTEXT', and '5.4 INTERPOLATING BETWEEN MODELS TRAINED ON OFF-EQUILIBRIUM DATA AND THE BOLTZMANN DISTRIBUTION WITH EXPERIMENTAL DATA' all describe empirical evaluation and data analysis. |
| Researcher Affiliation | Collaboration | 1 Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, SE-41296 Gothenburg, Sweden. 2 Molecular AI, Discovery Sciences, R&D, AstraZeneca Gothenburg, Pepparedsleden 1, 431 50 Mölndal, Sweden. |
| Pseudocode | Yes | Algorithm 1: Training (DisExp is defined in Algorithm 4). Algorithm 2: Sampling from pθ(x0, N). Algorithm 3: Ancestral sampling. Algorithm 4: Sampling from DisExp. |
| Open Source Code | Yes | Code is available at https://github.com/olsson-group/bopito. |
| Open Datasets | Yes | The Prinz potential is a 1D potential commonly used for benchmarking MD sampling methods (Prinz et al., 2011). We use publicly available data from Dibak et al. (2022), containing 1 µs of simulation time split into 20 trajectories. |
| Dataset Splits | Yes | For each number of trajectories n, we train 5000/n ITO models on n Prinz potential trajectories of length 150; trajectories do not overlap among different trainings. Prinz potential: 10 models are trained on non-overlapping trajectory sets for each number of trajectories. Alanine dipeptide: 10 models are trained on potentially overlapping random trajectory sets for each number of trajectories. Chignolin: 4 models are trained on potentially overlapping random trajectory sets for each number of trajectories. |
| Hardware Specification | No | The computations in this work were enabled by the Berzelius resource provided by the Knut and Alice Wallenberg Foundation at the National Supercomputer Centre. This statement does not provide specific hardware details such as GPU or CPU models, memory, or processing speeds. |
| Software Dependencies | No | We generate trajectories with an Euler-Maruyama integrator from the Deeptime library (Hoffmann et al., 2021). The paper mentions Deeptime but does not specify its version number, and no other software dependencies are listed with version numbers. |
| Experiment Setup | Yes | We report architectural and training hyper-parameters in Table 1. Table 1 includes: Diffusion steps (500, 1000), Noise schedule (Sigmoidal, Polynomial), Batch size (2,097,152, 1,024/32), Learning rate (0.001), Layers (3, 5), Embedding dimension (256), Net dimension (256), Optimizer (Adam), Inference ODE steps (50, 100/50). |
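The non-overlapping split described under Dataset Splits can be sketched as follows. This is a minimal illustration, not the authors' code: `disjoint_splits` and its arguments are hypothetical names, and the pool of 5000 trajectories is inferred from the "5000/n ITO models on n trajectories" statement.

```python
import numpy as np

def disjoint_splits(n_total, n_per_model, seed=0):
    # Shuffle all trajectory indices once, then hand out consecutive,
    # non-overlapping chunks of n_per_model indices to each model, so
    # no trajectory is shared across trainings.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_total)
    n_models = n_total // n_per_model
    return [idx[i * n_per_model:(i + 1) * n_per_model] for i in range(n_models)]

# e.g. n = 50 trajectories per model -> 5000 // 50 = 100 disjoint training sets
splits = disjoint_splits(5000, 50)
```

Because the chunks are taken from a single permutation, disjointness holds by construction rather than by rejection sampling.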
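The Euler-Maruyama integration mentioned under Software Dependencies can be sketched without Deeptime. Below is a minimal overdamped-Langevin simulator on a quadruple-well potential of the Prinz type; the potential coefficients, step size, and temperature are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def prinz_force(x):
    # Negative gradient of a quadruple-well potential of the form commonly
    # quoted for the Prinz potential; coefficients are illustrative.
    return -4.0 * (8.0 * x**7
                   - 128.0 * x * np.exp(-80.0 * x**2)
                   - 32.0 * (x - 0.5) * np.exp(-80.0 * (x - 0.5)**2)
                   - 40.0 * (x + 0.5) * np.exp(-40.0 * (x + 0.5)**2))

def euler_maruyama(x0, n_steps, dt=1e-5, kT=1.0, seed=0):
    # Overdamped Langevin update: x <- x + f(x) dt + sqrt(2 kT dt) * N(0, 1)
    rng = np.random.default_rng(seed)
    traj = np.empty(n_steps + 1)
    traj[0] = x = x0
    noise = np.sqrt(2.0 * kT * dt)
    for i in range(n_steps):
        x = x + prinz_force(x) * dt + noise * rng.standard_normal()
        traj[i + 1] = x
    return traj

traj = euler_maruyama(x0=0.0, n_steps=10_000)
```

The strong x^8 confinement keeps the walker near the wells, so short trajectories like this one stay within roughly [-1, 1].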