Compositional Flows for 3D Molecule and Synthesis Pathway Co-design
Authors: Tony Shen, Seonghwan Seo, Ross Irwin, Kieran Didi, Simon Olsson, Woo Youn Kim, Martin Ester
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, we evaluate 3DSynthFlow on the task of designing synthesizable drugs directly within protein pockets. 3DSynthFlow achieves a 4.2× sampling-efficiency improvement and state-of-the-art performance across all 15 targets in the LIT-PCBA benchmark (Tran-Nguyen et al., 2020) for binding affinity. 3DSynthFlow can be extended to the pocket-conditional setting and achieves state-of-the-art performance in both Vina Dock (−9.42) and AiZynth success rate (36.1%) on the CrossDocked benchmark. |
| Researcher Affiliation | Collaboration | 1School of Computing Science, Simon Fraser University, Canada 2Department of Chemistry, KAIST, Republic of Korea 3Molecular AI, Discovery Sciences, R&D, AstraZeneca 4Department of Computer Science and Engineering, Chalmers University of Technology, Sweden 5Department of Computer Science, University of Oxford, UK 6NVIDIA. Correspondence to: Tony Shen <EMAIL>. |
| Pseudocode | Yes | A.2. Sampling Algorithm. Algorithm 1: Compositional Flow Sampling with CGFlow. Require: step size Δt, interval λ, time window t_window. <br>1: init t = 0, i = 0, C_t = C_0, S_0 = S_0, Ŝ_1 = S_0 <br>2: while t < 1 do <br>3: x_t ← (C_t, S_t, Ŝ_1) ▷ Use self-conditioning. <br>4: if t mod λ = 0 then <br>5: i ← i + 1 <br>6: C^(i) ← π_θ(x_t) ▷ Compositional flow model: sample the next compositional component i. <br>7: S^(i)_t ← N(0, 1) ▷ Initialize new state values for component i under the fixed seed. <br>8: x_t ← T(x_t, C^(i), S^(i)_t) ▷ Transition with sampled composition and state value. <br>9: t^(i)_gen ← t <br>10: end if <br>11: t^(j)_local ← clip((t − t^(j)_gen) / t_window) <br>12: Ŝ^(j)_1 ← p_θ^(1\|t)(x_t, t^(j)_local), ∀j ≤ i ▷ State flow model: predict final state values. <br>13: S^(j)_(t+Δt) ← S^(j)_t + (Ŝ^(j)_1 − S^(j)_t) · κ^(j) Δt, ∀j ≤ i ▷ Step according to the predicted vector field. <br>14: t ← t + Δt <br>15: end while <br>16: return x_1 |
| Open Source Code | Yes | A. CGFlow details: All code is provided in the supplementary material. |
| Open Datasets | Yes | To answer these questions, we first apply 3DSynthFlow to optimize for targets in the LIT-PCBA benchmark (Tran-Nguyen et al., 2020) ... Lastly, we compare 3DSynthFlow against SBDD methods in the pocket-conditional setting on the CrossDocked dataset (Francoeur et al., 2020). |
| Dataset Splits | Yes | We apply the splitting and processing protocol on the CrossDocked dataset to obtain the same training split of protein-ligand pairs as previous methods (Luo et al., 2021; Peng et al., 2022). We use 99,000 complexes as the training set and the remaining 1,000 complexes as the validation set. |
| Hardware Specification | Yes | The state flow model (i.e., the pocket-conditional pose predictor) was trained for 30 epochs on 8 A4000 GPUs (16GB), requiring a total of 26 hours. ... In contrast, the compositional flow model is trained for 1,000 iterations under 20 flow matching steps. Training with GPU-accelerated docking (Uni-Dock; balance mode) takes between 7 and 12 hours (depending on the target) on a single A4000 GPU (16GB), making the computational requirement accessible for most practical drug discovery campaigns. |
| Software Dependencies | No | Explanation: The paper mentions specific tools like 'Uni-Dock' and 'AiZynthFinder', and references the 'Semla architecture' with a GitHub link, but it does not provide specific version numbers for any of the software dependencies used in its implementation (e.g., Python, PyTorch, RDKit, or specific versions of Uni-Dock or AiZynthFinder). |
| Experiment Setup | Yes | Table 5. Default hyperparameters used for compositional structure. ... Table 6. Default hyperparameters used for the state flow model. ... Table 7. Default hyperparameters used in all GFlowNets. ... Table 8. Specific hyperparameters used in 3DSynthFlow training. |
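The sampling algorithm quoted in the Pseudocode row can be sketched in Python. This is a minimal illustration, not the paper's implementation: `sample_composition` and `predict_final_state` are hypothetical placeholders standing in for the compositional flow model π_θ and the state flow model p_θ(1|t), and the `kappa` schedule is an assumption (the paper's κ_t is not specified in the extracted text).

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, per the algorithm's note

def sample_composition(x):
    """Placeholder for the compositional flow model pi_theta(x_t);
    here it just returns a random building-block id."""
    return int(rng.integers(0, 10))

def predict_final_state(x, t_local):
    """Placeholder for the state flow model p_theta(1|t): predicts each
    component's final (t = 1) state; here a dummy target at the origin."""
    comps, states = x
    return [np.zeros_like(s) for s in states]

def cgflow_sample(dt=0.05, lam=0.25, t_window=0.5, dim=3):
    """Sketch of the sampling loop: a new compositional component is added
    every `lam` units of flow time, and each component's continuous state is
    Euler-integrated toward the predicted endpoint over its own local window."""
    n_steps = round(1.0 / dt)
    add_every = round(lam / dt)
    comps, states, t_gen = [], [], []
    for step in range(n_steps):
        t = step * dt
        x = (comps, states)
        if step % add_every == 0:                    # lines 4-10: grow composition
            comps.append(sample_composition(x))
            states.append(rng.standard_normal(dim))  # init new state ~ N(0, 1)
            t_gen.append(t)
        # line 11: per-component local time, clipped to [0, 1]
        t_local = [min(max((t - tg) / t_window, 0.0), 1.0) for tg in t_gen]
        s_hat = predict_final_state(x, t_local)      # line 12
        for j in range(len(states)):                 # line 13: Euler step
            kappa = 1.0 / max(1.0 - t_local[j], dt)  # assumed kappa_t schedule
            states[j] += (s_hat[j] - states[j]) * kappa * dt
    return comps, states
```

With these defaults the loop adds 1/λ = 4 components; components whose local time reaches 1 before t = 1 are pulled all the way to the model's predicted endpoint, which mirrors how earlier-placed fragments finish refining before later ones.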