reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

ReQFlow: Rectified Quaternion Flow for Efficient and High-Quality Protein Backbone Generation

Authors: Angxiao Yue, Zichong Wang, Hongteng Xu

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments show that ReQFlow achieves on-par performance in protein backbone generation while requiring much fewer sampling steps and significantly less inference time (e.g., being 37× faster than RFDiffusion and 63× faster than Genie2 when generating a backbone of length 300), demonstrating its effectiveness and efficiency.
Researcher Affiliation	Academia	(1) Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; (2) School of Statistics, Renmin University of China, Beijing, China; (3) Beijing Key Laboratory of Research on Large Models and Intelligent Governance; (4) Engineering Research Center of Next-Generation Intelligent Search and Recommendation, MOE. Correspondence to: Hongteng Xu <EMAIL>.
Pseudocode	Yes	Algorithm 1: Training Procedure of QFlow; Algorithm 2: Inference; Algorithm 3: Training Procedure of ReQFlow
Open Source Code	Yes	Code is available at https://github.com/AngxiaoYue/ReQFlow.
Open Datasets	Yes	We apply two commonly used datasets in our experiments. The first is the 23,366 protein backbones collected from Protein Data Bank (PDB) (Burley et al., 2023), whose lengths range from 60 to 512. The second is the SCOPe dataset (Chandonia et al., 2022) pre-processed by FrameFlow (Yim et al., 2023a), which contains 3,673 protein backbones with lengths ranging from 60 to 128.
Dataset Splits	No	The paper does not explicitly provide standard training/test/validation splits for the PDB or SCOPe datasets. It describes data filtering criteria and how the rectification dataset was generated, but not the primary splits for evaluation.
Hardware Specification	Yes	All the experiments are implemented on four NVIDIA A100 80G GPUs.
Software Dependencies	No	The paper mentions software tools like ProteinMPNN, ESMFold, Foldseek, and DSSP but does not provide specific version numbers for these or other software dependencies.
Experiment Setup	Yes	Table 5. Training Hyperparameters (Hyperparameter: Value). aux loss t pass (time threshold): PDB=0.5, SCOPe=0.25; aux loss weight: 1.0; batch size: 128; max num res squared: PDB=1000000, SCOPe=500000; max epochs: 1000; learning rate: 0.0001; interpolant min t: 0.01
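
For readers who want the Table 5 values in machine-readable form, the following is a minimal sketch that collects them into a Python configuration object. The class name, the function name, and the underscore-style field names are assumptions made for illustration; they are not taken from the paper or from the ReQFlow repository, and only the numeric values come from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the Table 5 training hyperparameters."""
    # Dataset-specific values (no default; must be supplied per dataset).
    aux_loss_t_pass: float        # time threshold for the auxiliary loss
    max_num_res_squared: int
    # Values shared by both datasets, as listed in Table 5.
    aux_loss_weight: float = 1.0
    batch_size: int = 128
    max_epochs: int = 1000
    learning_rate: float = 1e-4
    interpolant_min_t: float = 0.01


# Per-dataset overrides reported in Table 5.
PER_DATASET = {
    "PDB": {"aux_loss_t_pass": 0.5, "max_num_res_squared": 1_000_000},
    "SCOPe": {"aux_loss_t_pass": 0.25, "max_num_res_squared": 500_000},
}


def make_config(dataset: str) -> TrainConfig:
    """Build the configuration for 'PDB' or 'SCOPe' (hypothetical helper)."""
    return TrainConfig(**PER_DATASET[dataset])
```

Calling make_config("PDB") or make_config("SCOPe") returns the shared values together with the dataset-specific ones; any other dataset name raises a KeyError, since the table reports values for these two datasets only.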