Stiefel Flow Matching for Moment-Constrained Structure Elucidation
Authors: Austin H Cheng, Alston Lo, Kin Long Kelvin Lee, Santiago Miret, Alan Aspuru-Guzik
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 EXPERIMENTS We evaluate Euclidean diffusion models and Stiefel Flow Matching on the QM9 and GEOM datasets. For each example, the model takes in moments and molecular formula and produces K = 10 samples. ... Table 1: Experimental results on QM9. Stiefel FM shows no violation of moment constraints as shown in the Error metrics, and has the highest success rate for structure elucidation, with the lowest computational cost. |
| Researcher Affiliation | Collaboration | Austin H. Cheng (1,2), Alston Lo (1,2), Kin Long Kelvin Lee (3), Santiago Miret (3), Alan Aspuru-Guzik (1,2,4). (1) University of Toronto, (2) Vector Institute, (3) Intel Labs, (4) Acceleration Consortium |
| Pseudocode | Yes | Algorithm 1: Computing a Stiefel geodesic γ(t) (Edelman et al., 1998). Algorithm 2: Computing the Stiefel logarithm (Zimmermann & Hüper, 2022). Algorithm 3: Sampling under Stiefel Flow Matching. Algorithm 4: Heuristic alignment algorithm. |
| Open Source Code | Yes | https://github.com/aspuru-guzik-group/stiefel FM |
| Open Datasets | Yes | Datasets. For QM9 (Ramakrishnan et al., 2014), we use the conformers provided by the GEOM dataset. We abbreviate GEOM-Drugs (Axelrod & Gomez-Bombarelli, 2022) as GEOM. |
| Dataset Splits | Yes | QM9 has train/val/test splits of 104265/13056/13033 molecules, while GEOM has splits of 233625/29203/29203 molecules, or 5537598/29203/29203 conformers. |
| Hardware Specification | Yes | Models were trained on 4 NVIDIA A100 40GB GPUs. |
| Software Dependencies | No | The paper lists numerous software libraries in the acknowledgments (e.g., PyTorch, PyTorch Lightning, RDKit, NumPy, SciPy, pandas), but it only cites the papers or development teams associated with them, without providing specific version numbers for the software used in the experiments. For example, it lists "PyTorch (Paszke et al., 2019)", which refers to the paper describing PyTorch, not a specific version used. |
| Experiment Setup | Yes | Table 3: General training and sampling hyperparameters (values given as QM9 / GEOM). Epochs: 1000 / 60. Batch size per GPU: 256 / 24. Optimizer: AdamW / AdamW. Learning rate: 10⁻⁴ / 10⁻⁴. Learning rate warmup steps: 2000 / 2000. Weight decay: 0.01 / 0.01. Gradient clipping: yes / yes. EMA decay: 0.9995 / 0.9995. KREED: Timesteps 1000 / 1000; Schedule polynomial / polynomial. Stiefel FM: Timesteps 200 / 200. Table 4: Training and sampling hyperparameters for Stiefel Flow Matching. Columns: Dataset, Model, Timestep sampling, OT, Stochasticity γ. Rows (Model: timestep sampling, OT, γ): Stiefel FM: uniform, no, 0.00. Stiefel FM-OT: uniform, yes, 0.00. Stiefel FM-OT-stoch: uniform, yes, 0.10. Stiefel FM-ln: logit-normal, no, 0.00. Stiefel FM-ln-OT: logit-normal, yes, 0.00. |
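
The Experiment Setup row distinguishes model variants by timestep sampling (uniform or logit-normal), use of OT, and stochasticity γ. As a rough illustration of the two timestep sampling schemes, the sketch below records the Table 4 variants and draws a training time t under each scheme. It is not taken from the authors' code: the names `VARIANTS` and `sample_timestep` are hypothetical, and the logit-normal location and scale (`mean=0.0`, `std=1.0`) are assumed values that the excerpt does not state.

```python
import math
import random

# Variants as listed in Table 4: (timestep sampling, uses OT, stochasticity gamma).
VARIANTS = {
    "Stiefel FM": ("uniform", False, 0.00),
    "Stiefel FM-OT": ("uniform", True, 0.00),
    "Stiefel FM-OT-stoch": ("uniform", True, 0.10),
    "Stiefel FM-ln": ("logit-normal", False, 0.00),
    "Stiefel FM-ln-OT": ("logit-normal", True, 0.00),
}


def sample_timestep(scheme, rng, mean=0.0, std=1.0):
    """Draw one training time t for flow matching.

    `mean` and `std` parameterize the underlying normal for the
    logit-normal scheme; both defaults are assumptions, not values
    reported in the source.
    """
    if scheme == "uniform":
        # Uniform on [0, 1).
        return rng.random()
    if scheme == "logit-normal":
        # Logit-normal: pass a Gaussian draw through the logistic sigmoid,
        # which concentrates samples around intermediate times.
        z = rng.gauss(mean, std)
        return 1.0 / (1.0 + math.exp(-z))
    raise ValueError(f"unknown timestep sampling scheme: {scheme!r}")


if __name__ == "__main__":
    rng = random.Random(0)
    for name, (scheme, uses_ot, gamma) in VARIANTS.items():
        t = sample_timestep(scheme, rng)
        print(f"{name}: scheme={scheme}, OT={uses_ot}, gamma={gamma}, t={t:.3f}")
```

With the assumed defaults, logit-normal draws are symmetric about t = 0.5 and place less mass near the endpoints than uniform sampling does.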