An All-Atom Generative Model for Designing Protein Complexes

Authors: Ruizhe Chen, Dongyu Xue, Xiangxin Zhou, Zaixiang Zheng, Xiangxiang Zeng, Quanquan Gu

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on general proteins demonstrate that APM is capable of generating tightly binding protein complexes, as well as performing multi-chain protein folding and inverse folding tasks; experiments on specific functional protein design tasks show that APM outperforms SOTA baselines in antibody and peptide design with higher binding affinity.
Researcher Affiliation | Collaboration | 1 College of Computer Science and Electronic Engineering, Hunan University; 2 School of Artificial Intelligence, University of Chinese Academy of Sciences; 3 ByteDance Seed (this work was done during Ruizhe Chen's and Xiangxin Zhou's internships at ByteDance Seed). Correspondence to: Quanquan Gu <EMAIL>.
Pseudocode | No | The paper describes the model architecture and training process in detail, and refers to algorithms from AlphaFold2's supplementary information for loss calculation, but it does not present any pseudocode or algorithm blocks within the main text.
Open Source Code | Yes | We released our code at https://github.com/bytedance/apm.
Open Datasets | Yes | Single-chain data is built from three sources: PDB (Berman et al., 2000), Swiss-Prot (Boeckmann et al., 2003), and AFDB (Varadi et al., 2022). For PDB samples, we followed the data-processing flow in MultiFlow, resulting in 18,684 samples. Multi-chain data is built from PDB Biological Assemblies (Rose et al., 2016). We use the Structural Antibody Database (Dunbar et al., 2014) and perform evaluation on the RAbD benchmark (Adolf-Bryfogle et al., 2018). We use the PepBench (Kong et al., 2024) dataset for training and validation, and the LNR (Tsaban et al., 2022) dataset as the test set.
Dataset Splits | Yes | We then select the clusters that do not contain complexes in the RAbD benchmark (Adolf-Bryfogle et al., 2018) and split the complexes into training and validation sets at a 9:1 ratio (1,786 and 193 complexes, respectively). The test set consists of 55 eligible complexes from the RAbD benchmark. For these tasks, we used samples missing cluster IDs that were dropped during training as the test set, and we also removed samples exceeding a length of 512. The final test set comprised 273 proteins with 2-6 chains each.
Hardware Specification | Yes | In training phase I, the Seq&BB Module was trained on 64 H100 GPUs for 257,000 steps with a learning rate of 1e-4. The Sidechain Module was trained on 8 H100 GPUs, accumulating a total of 836,901 steps, also with a learning rate of 1e-4. In training phase II, APM was trained on 64 H100 GPUs for 235,000 steps. In the SFT phase, we used 8 H100 GPUs to fine-tune APM for antibody design and peptide design.
Software Dependencies | No | The paper mentions using ESM2-650M, PyRosetta, OpenMM, MMseqs2, and FAESM (a drop-in replacement for the ESM protein language model implementation), but it does not provide version numbers for these software components, which limits reproducibility.
Experiment Setup | Yes | In training phase I, the Seq&BB Module was trained on 64 H100 GPUs for 257,000 steps with a learning rate of 1e-4. The Sidechain Module was trained on 8 H100 GPUs, accumulating a total of 836,901 steps, also with a learning rate of 1e-4. In training phase II, APM was trained on 64 H100 GPUs for 235,000 steps; the learning rate for the Seq&BB Module is set to 1e-5. In the SFT phase, we used 8 H100 GPUs to fine-tune APM for antibody design and peptide design. The SFT phase lasted for 1,200 epochs for every task. The learning rate for each module is set to 5e-5 and the training cycle is adjusted to 10-1-1. The temperature T follows an exponential decay schedule, T = T_max * exp(-λ * t / S), with hyperparameters T_max = 30 and decay rate λ = 30.
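The reported temperature schedule can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes t is the current step and S the total number of steps, so the exponent is -λ·t/S (the paper's formatting leaves the exact normalization ambiguous).

```python
import math

T_MAX = 30.0   # T_max from the paper
DECAY = 30.0   # decay rate λ from the paper

def temperature(t: int, total_steps: int) -> float:
    """Exponentially decayed sampling temperature at step t (assumed normalization t/S)."""
    return T_MAX * math.exp(-DECAY * t / total_steps)

# Under this reading, the schedule starts at T_max and decays by a factor
# of e^-30 (effectively to zero) by the final step.
print(temperature(0, 1200))     # T_max = 30.0
print(temperature(1200, 1200))  # 30 * e^-30, vanishingly small
```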