Scalable Equilibrium Sampling with Sequential Boltzmann Generators
Authors: Charlie B. Tan, Joey Bose, Chen Lin, Leon Klein, Michael M. Bronstein, Alexander Tong
ICML 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | SBG achieves state-of-the-art performance w.r.t. all metrics on peptide systems, demonstrating the first equilibrium sampling in Cartesian coordinates of tri-, tetra- and hexa-peptides that were thus far intractable for prior Boltzmann generators. ...Empirically, we observe SBG to achieve state-of-the-art results across metrics, far outperforming continuous BGs on all datasets. ...We evaluate SBG and our baseline methods with quantitative metrics summarized in Table 2 and Table 3. Where ± is present, three models are independently trained and sampled; unless otherwise stated, 10⁴ particles are sampled. We provide examples of SBG generated samples in Figure 2. |
| Researcher Affiliation | Collaboration | 1University of Oxford 2Mila Québec AI Institute 3Freie Universität Berlin 4AITHYRA 5Université de Montréal. |
| Pseudocode | Yes | We state the full SBG sampling algorithm with adaptive resampling in Algorithm 1 |
| Open Source Code | Yes | We open source our full codebase at https://github.com/charliebtan/transferable-samplers. |
| Open Datasets | Yes | For this dataset we use the data and data split from Klein & Noé (2024). ...For the peptides composed of multiple alanine amino acids, we generate MD trajectories using the OpenMM library (Eastman et al., 2017). ...For this dataset we use the same system setup as in Dibak et al. (2022) |
| Dataset Splits | Yes | For all datasets besides alanine dipeptide we use a training set of 10⁵ contiguous samples (1 µs simulation time) from a single MCMC chain, a validation set of the next 2×10⁴ contiguous samples (0.2 µs simulation time), and a test set of 10⁴ uniformly strided subsamples from the remaining trajectory. |
| Hardware Specification | Yes | Left: GPU hours (NVIDIA L40S) for sampling and reweighting 10⁴ points. For the sampling inference times in Figure 7, we compute all times on a single NVIDIA L40S GPU, using the maximum power-of-two batch size possible. For training times we compute all times on a single A100 80GB GPU, except for SE(3)-EACF, which is trained on a single H100. |
| Software Dependencies | No | The paper mentions |
| Experiment Setup | Yes | For inference we use a Dormand-Prince 45 (dopri5) adaptive step size solver with absolute tolerance 10⁻⁴ and relative tolerance 10⁻⁴. ...We use a learning rate of 1×10⁻⁴, weight decay of 4×10⁻⁴, and Adam (β1, β2) of (0.90, 0.95). We additionally employ the same cosine decay learning rate schedule with warmup (start and end learning rate 500 times lower than maximum value) and exponential moving average decay (0.999) used in ECNF++. Training is performed for 1000 epochs. ...For alanine systems we use 100 Langevin time steps, with ESS_threshold = 0.5 and ϵ = 1×10⁻⁵ up to trialanine, and ϵ = 1×10⁻⁶ thereafter. For chignolin we use 500 time steps, with ESS_threshold = 0.5 and ϵ = 1×10⁻⁵. |
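The adaptive-resampling step quoted above (Algorithm 1, triggered when the normalized effective sample size falls below ESS_threshold = 0.5) can be sketched as follows. This is a minimal illustration of ESS-thresholded multinomial resampling over importance-weighted particles, not the paper's implementation; the function names and the NumPy-only setup are assumptions for the sketch.

```python
import numpy as np

def effective_sample_size(log_weights):
    """Normalized ESS in (0, 1]: 1 / (N * sum of squared normalized weights)."""
    w = np.exp(log_weights - log_weights.max())  # stabilize before normalizing
    w /= w.sum()
    return 1.0 / (np.sum(w ** 2) * len(w))

def maybe_resample(samples, log_weights, ess_threshold=0.5, rng=None):
    """Multinomial resampling when normalized ESS drops below the threshold.

    After resampling, particles are equally weighted (log weights reset to 0).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(samples)
    if effective_sample_size(log_weights) < ess_threshold:
        w = np.exp(log_weights - log_weights.max())
        w /= w.sum()
        idx = rng.choice(n, size=n, p=w)  # draw N particles proportional to weight
        return samples[idx], np.zeros(n)
    return samples, log_weights
```

In a sequential sampler this check would run between annealing (e.g. Langevin) steps, so that degenerate weight distributions are collapsed back onto the high-weight particles before further propagation.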