AssembleFlow: Rigid Flow Matching with Inertial Frames for Molecular Assembly
Authors: Hongyu Guo, Yoshua Bengio, Shengchao Liu
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical validation on the benchmarking data COD-Cluster17 shows that AssembleFlow significantly outperforms six competitive deep learning baselines by at least 45% in assembly matching scores while maintaining 100% molecular integrity. Also, it matches the assembly performance of a widely used domain-specific simulation tool while reducing computational cost by 25-fold. We empirically evaluate AssembleFlow using the benchmarking crystallization dataset COD-Cluster17. The quantitative results reveal that AssembleFlow significantly outperforms six competitive deep learning baselines by at least 45% in terms of assembly matching score. Also, AssembleFlow exhibits strong assembly performance compared to a widely used domain-specific simulation tool for molecular assembly, achieving this with a 25-fold reduction in computational cost. Furthermore, we present qualitative results, including atomic collision properties of predicted crystals, which further demonstrate AssembleFlow's effectiveness in preserving and modeling the rigidity of the molecular crystallization and assembly process. |
| Researcher Affiliation | Academia | Hongyu Guo (National Research Council Canada; University of Ottawa); Yoshua Bengio (Mila Québec AI Institute; Université de Montréal; CIFAR AI Chair); Shengchao Liu (Université de Montréal) |
| Pseudocode | Yes | Note: the pseudo algorithm of our AssembleFlow is provided in Appendix E.3. A high-level overview and pseudo algorithm are provided in Algorithms 1 and 2 in Appendix E.3. |
| Open Source Code | Yes | The codes and checkpoints are available at this GitHub repository. |
| Open Datasets | Yes | We evaluate our method using the crystallization dataset COD-Cluster17 (Liu et al., 2024c). This COD-Cluster17 is a curated subset derived from the Crystallography Open Database (COD) database (Grazulis et al., 2009). |
| Dataset Splits | No | We evaluate our method using the crystallization dataset COD-Cluster17 (Liu et al., 2024c). This COD-Cluster17 contains 133K crystals and is a curated subset derived from the Crystallography Open Database (COD) database (Grazulis et al., 2009). We consider three versions of COD-Cluster17, with 5k, 10k, and all data, respectively. |
| Hardware Specification | No | YB acknowledges support from NRC AI4D, CIFAR, and the CIFAR AI Chair program. This project's computational resources are provided by NRC and the Digital Research Alliance of Canada. |
| Software Dependencies | No | For each molecule in the cluster, we adopt the SE(3)-equivariant PaiNN (Schütt et al., 2021) to obtain the representation for each atom. ... The outputs include a molecular level predicted rotation velocity $\hat{q}_\theta \in \mathbb{R}^{M \times 3}$ and predicted translation velocity $\hat{x}_\theta \in \mathbb{R}^{M \times 3}$, where M is the number of molecules in the cluster. ... Optimization: seed {0, 42, 123}; epochs {1000, 2000}; cutoff c {20, 50}; learning rate {1e-4, 5e-4}; optimizer {Adam} |
| Experiment Setup | Yes | We provide the key hyper-parameters of AssembleFlow in Table 6. Table 6: Hyperparameter specifications for AssembleFlow (Model: Hyperparameter {Value}). Intra-modeling PaiNN: embedding dim {128}; num of layers {3}; cutoff {5}; read out {mean}. Intra-modeling Atomic Level: num of layers {2, 5}; num of convolution {2}; num of head {4, 8}; num of timesteps {50, 200}; α0 {1}; α1 {1, 10}. Intra-modeling Molecular Level: num of layers {4, 5}; num of head {4, 8}; num of timesteps {50, 200}; α0 {1}; α1 {1, 10}. Optimization: seed {0, 42, 123}; epochs {1000, 2000}; cutoff c {20, 50}; learning rate {1e-4, 5e-4}; optimizer {Adam} |
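
The optimization rows of Table 6 list a set of candidate values per hyperparameter. As a rough illustration of how large that search space is, the following Python sketch expands those sets into individual configurations. The dictionary keys and the `expand_grid` helper are hypothetical names chosen for this example, and the source does not state whether the authors ran the full Cartesian product or only a subset of it.

```python
from itertools import product

# Candidate values copied from the "Optimization" rows of Table 6.
# The key names are illustrative assumptions, not identifiers from the authors' code.
OPTIMIZATION_GRID = {
    "seed": [0, 42, 123],
    "epochs": [1000, 2000],
    "cutoff_c": [20, 50],
    "learning_rate": [1e-4, 5e-4],
    "optimizer": ["Adam"],
}


def expand_grid(grid):
    """Return one dict per combination in the Cartesian product of the grid."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


configs = expand_grid(OPTIMIZATION_GRID)
print(len(configs))  # 3 * 2 * 2 * 2 * 1 = 24 combinations
print(configs[0])
```

Under the full-product assumption, the optimization settings alone give 24 runs before any of the model-level choices (number of layers, heads, timesteps, α1) are varied.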