IgGM: A Generative Model for Functional Antibody and Nanobody Design

Authors: Rubo Wang, Fandi Wu, Xingyu Gao, Jiaxiang Wu, Peilin Zhao, Jianhua Yao

ICLR 2025

Reproducibility Variable | Result | LLM Response (evidence)
Research Type | Experimental | Section 4 (Experiments): "We constructed our training, validation, and test sets from the SAbDab database..."; Table 1: complex structure prediction; Table 2: results of the novel antibody design on SAb-2023H2-Ab; ablation studies (Appendix E).
Researcher Affiliation | Collaboration | Rubo Wang (1,2,3), Fandi Wu (3), Xingyu Gao (1,2), Jiaxiang Wu (3), Peilin Zhao (3), Jianhua Yao (3); (1) Institute of Microelectronics, Chinese Academy of Sciences, Beijing, China; (2) University of Chinese Academy of Sciences, Beijing, China; (3) Tencent AI Lab, Shenzhen, China
Pseudocode | Yes | Algorithm 1 (IgGM Sampling):
  Input: model f_theta(·, ·); sequence of time points tau_1 > tau_2 > ... > tau_{N-1}; initial noise (s_hat_T, x_hat_T); antigen (s^A, x^A)
  (s, x) <- f_theta((s_hat_T, x_hat_T), T, (s^A, x^A))
  for n = 1 to N-1 do
      Sample Q_z = Q^1 Q^2 ... Q^T with q(x_t = j | x_{t-1} = i), and x_z ~ (N(0, I), Uniform(SO(3)))
      s_hat_{tau_n} <- s Q_z;  x_hat_{tau_n} <- x + sqrt(tau_n^2 - eps^2) x_z
      (s, x) <- f_theta((s_hat_{tau_n}, x_hat_{tau_n}), tau_n, (s^A, x^A))
  end for
  Output: (s, x) at tau = 0
Open Source Code | Yes | "Code is available at: https://github.com/TencentAI4S/IgGM"
Open Datasets | Yes | "We constructed our training, validation, and test sets from the SAbDab database, employing the widely used method of dividing the dataset based on time, as previously established in other works (Jumper et al., 2021; Ruffolo et al., 2023; Wu et al., 2024; Abramson et al., 2024)."
Dataset Splits | Yes | "We constructed our training, validation, and test sets from the SAbDab database... This process resulted in 101 validation samples and 60 test samples, both of which were completely unrelated to the training set."
Hardware Specification | Yes | "This process lasted for 5 days on 8 A100 GPUs."
Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., programming languages or libraries) that would be needed to replicate the experimental environment.
Experiment Setup | Yes | "we use the Adam (Loshchilov & Hutter, 2017) optimizer and set the batch size of the training process to 32. We also maintain an EMA (Exponential Moving Average) decay of 0.999 for the model parameters... In the model training process, we assigned probabilities of 4 : 2 : 2 : 2 for the model to design CDR H3, CDR H, all CDRs, and to refrain from sequence design, respectively."
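The iterative sampling loop in Algorithm 1 above can be sketched in generic diffusion-sampler form. This is a minimal illustration, not the authors' implementation: `f_theta` is a placeholder for the trained model, the discrete sequence corruption via Q_z is omitted, and `eps` and the noise schedule are assumed values.

```python
import numpy as np

def iggm_style_sample(f_theta, s_T, x_T, antigen, taus, eps=0.05, rng=None):
    """Diffusion-style sampling loop in the shape of Algorithm 1 (sketch).

    f_theta((s, x), t, antigen) -> (s, x) stands in for the trained model;
    taus is a decreasing sequence of noise levels tau_1 > ... > tau_{N-1}.
    All names and constants here are illustrative.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    # Initial denoising prediction from pure noise at the largest time T.
    s, x = f_theta((s_T, x_T), len(taus), antigen)
    for tau_n in taus:
        # Re-noise the current structure estimate back to level tau_n.
        x_z = rng.standard_normal(x.shape)
        x_hat = x + np.sqrt(max(tau_n**2 - eps**2, 0.0)) * x_z
        # The sequence would be corrupted via discrete transitions Q_z;
        # this sketch keeps it fixed for simplicity.
        s_hat = s
        # Predict the clean complex again from the re-noised state.
        s, x = f_theta((s_hat, x_hat), tau_n, antigen)
    return s, x
```

Each iteration alternates "re-noise to level tau_n, then denoise", so the estimate is progressively refined as tau shrinks toward 0.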
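The time-based split quoted in the Dataset Splits row can be sketched as follows. The entries and the cutoff date are toy placeholders (the paper does not state its exact cutoff in the quoted text); the idea is simply that structures deposited after the cutoff are held out.

```python
from datetime import date

# Toy SAbDab-style entries as (id, deposition_date) pairs; both the entries
# and the cutoff below are illustrative, not the paper's actual data.
entries = [
    ("7abc", date(2021, 5, 1)),
    ("7mno", date(2022, 12, 20)),
    ("8xyz", date(2023, 8, 15)),
    ("8def", date(2023, 11, 2)),
]

CUTOFF = date(2023, 6, 30)  # hypothetical cutoff date

# Everything deposited on or before the cutoff trains the model;
# later depositions form the held-out pool (split further into val/test).
train = [pdb for pdb, d in entries if d <= CUTOFF]
held_out = [pdb for pdb, d in entries if d > CUTOFF]
```

Splitting by deposition date, rather than randomly, keeps the held-out structures temporally disjoint from training data, which is why the paper can claim the validation and test samples are "completely unrelated to the training set".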
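Two of the training choices quoted in the Experiment Setup row, the EMA with decay 0.999 and the 4:2:2:2 task-sampling probabilities, can be sketched like this. Function and task names are paraphrases for illustration, not the authors' code.

```python
import random

EMA_DECAY = 0.999  # decay quoted in the paper

def ema_update(ema_params, params, decay=EMA_DECAY):
    """One EMA step over a dict of scalar 'parameters' (toy stand-in for tensors)."""
    for k, v in params.items():
        ema_params[k] = decay * ema_params[k] + (1.0 - decay) * v
    return ema_params

# Per-batch design-task sampling with the stated 4:2:2:2 probabilities;
# the task names paraphrase the paper's description.
TASKS = ["cdr_h3", "cdr_h", "all_cdrs", "no_sequence_design"]
WEIGHTS = [4, 2, 2, 2]

def sample_task(rng=random):
    return rng.choices(TASKS, weights=WEIGHTS, k=1)[0]
```

With decay 0.999, the EMA copy of the weights moves only 0.1% of the way toward the current parameters per step, giving a smoothed model for evaluation; the weighted task sampling biases training toward the hardest target, CDR H3.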