Multi-objective antibody design with constrained preference optimization
Authors: Milong Ren, Zaikai He, Haicang Zhang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluated on independent test sets, AbNovo outperforms existing methods in metrics of binding affinity such as Rosetta binding energy and evolutionary plausibility, as well as in metrics for other biophysical properties like stability and specificity. |
| Researcher Affiliation | Academia | Milong Ren 3,4; Zaikai He 1,3,4; Haicang Zhang 1,2. 1 Medicinal Bioinformatics Center, Shanghai Jiao Tong University School of Medicine; 2 Central China Research Institute of Artificial Intelligence; 3 Institute of Computing Technology, Chinese Academy of Sciences; 4 University of Chinese Academy of Sciences. Correspondence should be addressed to H. Zhang (EMAIL). |
| Pseudocode | Yes | Algorithm 1 Constrained Preference Optimization for Antibody Design |
| Open Source Code | Yes | CODE AVAILABILITY: Code for AbNovo can be found at https://github.com/CarbonMatrixLab/AbNovo. |
| Open Datasets | Yes | We trained AbNovo using antibody-antigen complex structures derived from the SAbDab database (Dunbar et al., 2014) and evaluated its performance on the RAbD test set, which is widely used for in silico antibody design. |
| Dataset Splits | No | The paper mentions training on the SAbDab database and evaluating on the RAbD test set, and describes a '40% sequence identity threshold on CDR-H3' to eliminate overlap between training and test sets. However, it does not specify explicit percentages or sample counts for training, validation, or test splits, nor does it provide citations to predefined splits with such details. |
| Hardware Specification | Yes | We use 8 Nvidia A100 (80G) for training, and the batch size is 128 for all training stages. |
| Software Dependencies | No | The paper mentions using several software components and tools, such as the Rosetta software, IgLM (Shuai et al., 2021), MMseqs, AlphaFold2, and the Adam optimizer. However, it does not provide specific version numbers for any of these components. |
| Experiment Setup | Yes | Table 14 (hyper-parameters of AbNovo) lists three training stages: (1) Pre-trained LM — loss L_MLM + L_distogram + L_contact, 200k steps, learning rate 5e-5, dataset AFDB (2M) + PDB (filtered); (2) Base model — loss 1.0L(x) + 0.5L(r) + 0.2L(a) + 0.1L_violation + 1.0L_aux, 20k steps, learning rate 1e-4, antigen-antibody complexes; (3) Fine-tuning — L_update-policy (Equation 10), 20k steps, learning rate 2e-5, preference dataset. We show information about the training process, objectives, and learning rates of AbNovo in Table 14. In particular, in the fine-tuning stage, for each single update of λ, we update the policy for 100 steps. Details of the losses used when updating the policy and λ are given in Sections 3.3.1 and 3.3.2. We use 8 Nvidia A100 (80G) GPUs for training, and the batch size is 128 for all training stages. For all training procedures, we use the Adam optimizer with default parameters. In practice, we choose [α(x), α(r), α(a)] = [1.0, 0.5, 0.2], α(sup) = 0.5, and K = 8. We also add the regularisation term α_t^(R) in Equation 9 to ensure training stability, choosing α_t^(R) = 10.0. In our experiments, considering the different magnitudes of the various rewards and constraints, we normalized all rewards and constraints during training. We set the initial λ to [1.0, 1.0, 1.0] and the reward weights ω1 and ω2 both to 1.0. |
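The fine-tuning schedule quoted in the Experiment Setup row is an alternating primal-dual loop: 100 policy updates per single update of the Lagrange multipliers λ, with λ kept non-negative. The sketch below illustrates that schedule on an invented toy problem (a scalar policy parameter θ, reward r(θ) = θ, and a quadratic constraint θ² ≤ budget); the function name, learning rates, and toy objective are all illustrative assumptions, not the paper's actual losses from Equations 9-10.

```python
def train_constrained(theta=0.0, lam=1.0, budget=1.0,
                      lam_steps=200, policy_steps=100,
                      lr_theta=0.01, lr_lam=0.1):
    """Toy primal-dual loop mirroring the paper's schedule:
    100 policy updates per lambda update, lambda projected to >= 0.

    Toy problem (illustrative only): maximize r(theta) = theta
    subject to c(theta) = theta**2 <= budget.
    """
    for _ in range(lam_steps):
        for _ in range(policy_steps):
            # Gradient ascent on the Lagrangian
            #   L(theta, lam) = r(theta) - lam * (c(theta) - budget)
            grad = 1.0 - lam * 2.0 * theta
            theta += lr_theta * grad
        # Dual ascent on lambda: grow lambda while the constraint is
        # violated, shrink it otherwise; project back to lambda >= 0.
        lam = max(0.0, lam + lr_lam * (theta * theta - budget))
    return theta, lam
```

For this toy problem the loop settles near the constrained optimum θ ≈ 1 with λ ≈ 0.5. In the paper's setting the inner step would instead minimize the policy-update loss of Equation 10, and (as the quoted text notes) the rewards and constraints are normalized to comparable magnitudes before entering the loop; the single scalar constraint here stands in for the three constraints behind the initial λ = [1.0, 1.0, 1.0].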