Aligning Protein Conformation Ensemble Generation with Physical Feedback
Authors: Jiarui Lu, Xiaoyin Chen, Stephen Zhewen Lu, Aurelie Lozano, Vijil Chenthamarakshan, Payel Das, Jian Tang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the MD ensemble benchmark demonstrate that EBA achieves state-of-the-art performance in generating high-quality protein ensembles. By improving the physical plausibility of generated structures, our approach enhances model predictions and holds promise for applications in structural biology and drug discovery. |
| Researcher Affiliation | Collaboration | 1Mila – Québec AI Institute, 2Université de Montréal, 3McGill University, 4IBM Research, 5HEC Montréal, 6CIFAR AI Chair. |
| Pseudocode | Yes | Algorithm 1: Fine-tuning Diffusion Model with EBA; Algorithm 2: Inference of Diffusion Module (Algo. 18 in Abramson et al. (2024)); Algorithm 3: Structure Rigid Align (Kabsch–Umeyama Algorithm) |
| Open Source Code | No | The paper acknowledges the authors of Protenix for open-sourcing their code, but does not state that the code for the methodology described in this paper (EBA) is open-source or provide a link to it. |
| Open Datasets | Yes | To demonstrate the effectiveness of the proposed fine-tuning pipeline, we evaluate the protein ensemble generation task on the ATLAS dataset (Vander Meersche et al., 2024) following the benchmark in Jing et al. (2024a). |
| Dataset Splits | Yes | We strictly follow the experimental settings as well as the data split in Jing et al. (2024a) and sample 250 predictions per test target using different models. Specifically, we download the ATLAS MD trajectories, which comprise 1,390 proteins selected for structural diversity based on ECOD domain classification. This results in train / validation / test splits of 1,266 / 39 / 82 MD ensembles, with the rest excluded due to excessive sequence length (Jing et al., 2024a). |
| Hardware Specification | Yes | The training was conducted with NVIDIA A100 GPUs. ... Similarly, experiments were run on NVIDIA A100 GPUs. ... The DDP parallelism and 4 A100 40GB GPUs are used for this benchmarking. |
| Software Dependencies | No | The paper mentions using 'Protenix' as the AlphaFold3 architecture implementation, the 'OpenMM suite', 'CHARMM36', the 'GBn2 model', and 'scipy.spatial.transform.Rotation'. However, it does not provide specific version numbers for these software components, which is required for reproducibility. |
| Experiment Setup | Yes | The parameter optimization is performed with the Adam optimizer (Kingma & Ba, 2014), using a learning rate of 0.001, β values of (0.9, 0.95), and a weight decay of 1 × 10⁻⁸. The learning rate follows an Exponential LR schedule with a warm-up phase of 200 steps and a decay factor γ of 0.95 applied every 50k optimizer steps. We set λ_MSE = 1.0 and λ_LDDT = 1.0 in Eq. (14). During training the noise level is sampled from σ_data · e^(−1.2 + 1.5·N(0,1)) as the default setting with σ_data = 16. In each optimizer step, we clip the gradient norm by 10. The SFT process consists of two stages: in the first stage, input structures with more than 384 residues are randomly cropped to a fixed size of 384. ... Random rigid augmentation is applied during diffusion training with an internal diffusion batch size of 32. In the second stage, the cropping size is increased to 768, and random rigid augmentation is applied with a reduced internal batch size of 16. ... For EBA, we follow the same optimizer and scheduler as the SFT stage but use a smaller base learning rate of 1.0 × 10⁻⁷. To reduce the variance of the gradient during training, we accumulate the gradient per 16 steps and also clip the norm by 10. We set the energy temperature factor β = 1/√L, where L is the protein length of the current mini-batch of samples, and the combined model temperature factor α_T = 50. The (internal) diffusion batch size is set to 8 during alignment. ... The inference noise scheduler also has the same configuration as Abramson et al. (2024), i.e., s_max = 160.0, s_min = 4 × 10⁻⁴, p = 7 and σ_data = 16. |
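The training noise-level distribution quoted in the setup row, σ = σ_data · e^(−1.2 + 1.5·N(0,1)) with σ_data = 16, is a log-normal sampler. A minimal sketch, assuming only the formula as stated (function name and sample count are illustrative):

```python
import numpy as np

def sample_noise_level(rng, sigma_data=16.0, mean=-1.2, std=1.5):
    """Draw one training noise level: sigma_data * exp(mean + std * z), z ~ N(0, 1)."""
    z = rng.standard_normal()
    return sigma_data * np.exp(mean + std * z)

rng = np.random.default_rng(0)
sigmas = np.array([sample_noise_level(rng) for _ in range(100_000)])
# The median of this log-normal is sigma_data * exp(-1.2) ~= 4.82,
# so most training noise levels sit well below sigma_data = 16.
```

Because the exponent is Gaussian, the distribution is heavy-tailed: occasional draws far above σ_data expose the model to near-pure-noise inputs, while the bulk of training happens at moderate noise.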
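The pseudocode row lists Algorithm 3, a structure rigid alignment via the Kabsch–Umeyama algorithm. A self-contained NumPy sketch of the standard Kabsch step (the paper's actual implementation reportedly uses `scipy.spatial.transform.Rotation`; this function name and SVD formulation are assumptions, not the authors' code):

```python
import numpy as np

def rigid_align(mobile, target):
    """Optimally rotate and translate `mobile` (N x 3) onto `target` (N x 3)
    via the Kabsch algorithm, returning the aligned coordinates."""
    mu_m, mu_t = mobile.mean(axis=0), target.mean(axis=0)
    P, Q = mobile - mu_m, target - mu_t          # center both point clouds
    H = P.T @ Q                                  # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T      # optimal proper rotation
    return P @ R.T + mu_t                        # rotate, then re-translate
```

In the fine-tuning pipeline this kind of alignment removes the global rigid-body degrees of freedom before coordinate losses (e.g. MSE) are computed, so the loss measures only internal structural differences.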