Synergy of GFlowNet and Protein Language Model Makes a Diverse Antibody Designer

Authors: Mingze Yin, Hanjing Zhou, Yiheng Zhu, Jialu Wu, Wei Wu, Mingyang Li, Kun Fu, Zheng Wang, Chang-Yu Hsieh, Tingjun Hou, Jian Wu

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate PGAbD on extensive antibody design benchmarks. It significantly outperforms existing methods in diversity (13.5% on RAbD, 31.1% on SAbDab) while maintaining optimal developability and novelty. Generated antibodies are also found to form stable and regular 3D structures with their corresponding antigens, demonstrating the great potential of PGAbD to accelerate real-world antibody discovery.
Researcher Affiliation | Collaboration | 1 College of Computer Science & Technology and Liangzhu Laboratory, Zhejiang University; 2 State Key Laboratory of Transvascular Implantation Devices of The Second Affiliated Hospital, Zhejiang University; 3 College of Pharmaceutical Sciences, Zhejiang University; 4 School of Artificial Intelligence and Data Science, University of Science and Technology of China; 5 Alibaba Cloud Computing; 6 Zhejiang Key Laboratory of Medical Imaging Artificial Intelligence
Pseudocode | No | The paper describes the methodology in narrative text and mathematical formulations but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available at https://github.com/KDurant-123/PG-AbD.
Open Datasets | Yes | We select RAbD (Adolf-Bryfogle et al. 2018) as the benchmark (60 antibody sequences for testing, with no training data). Most existing approaches need additional training data (usually antibody sequences from the SAbDab database (Dunbar et al. 2014)).
Dataset Splits | Yes | We select RAbD (Adolf-Bryfogle et al. 2018) as the benchmark (60 antibody sequences for testing, with no training data).
Hardware Specification | Yes | Our model is trained upon the PyTorch framework using 4 NVIDIA GeForce RTX 4090 GPUs.
Software Dependencies | No | The paper mentions the 'PyTorch framework' and 'ProGen2-base' as key components but does not provide specific version numbers for PyTorch or other software dependencies, which are necessary for a reproducible environment specification.
Experiment Setup | Yes | We use learning rates of 1×10⁻⁴ and 5×10⁻³ to train the PoE (i.e., the reward function) and the policy network of the GFlowNet, respectively, with both being updated via the Adam optimizer.
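The optimizer configuration reported above can be sketched in PyTorch. This is a minimal illustration, not the authors' code: the two nn.Linear modules are hypothetical stand-ins for the actual PoE reward function and GFlowNet policy network, which are not specified here; only the framework (PyTorch), optimizer (Adam), and the two learning rates come from the paper.

```python
import torch
import torch.nn as nn

# Placeholder modules standing in for the paper's models (assumptions,
# not the real architectures): a PoE reward function and a GFlowNet policy.
poe_reward = nn.Linear(16, 1)    # hypothetical stand-in for the PoE reward
policy_net = nn.Linear(16, 16)   # hypothetical stand-in for the GFlowNet policy

# Two separate Adam optimizers with the learning rates reported in the paper:
# 1e-4 for the reward function, 5e-3 for the policy network.
opt_reward = torch.optim.Adam(poe_reward.parameters(), lr=1e-4)
opt_policy = torch.optim.Adam(policy_net.parameters(), lr=5e-3)
```

Keeping the reward model and the policy on separate optimizers lets each be updated on its own schedule with its own learning rate, as the setup describes.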