Stiefel Flow Matching for Moment-Constrained Structure Elucidation

Authors: Austin H Cheng, Alston Lo, Kin Long Kelvin Lee, Santiago Miret, Alan Aspuru-Guzik

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 4 (Experiments): We evaluate Euclidean diffusion models and Stiefel Flow Matching on the QM9 and GEOM datasets. For each example, the model takes in moments and molecular formula and produces K = 10 samples. ... Table 1: Experimental results on QM9. Stiefel FM shows no violation of moment constraints as shown in the Error metrics, and has the highest success rate for structure elucidation, with the lowest computational cost.
Researcher Affiliation | Collaboration | Austin H. Cheng (1,2), Alston Lo (1,2), Kin Long Kelvin Lee (3), Santiago Miret (3), Alan Aspuru-Guzik (1,2,4). Affiliations: (1) University of Toronto, (2) Vector Institute, (3) Intel Labs, (4) Acceleration Consortium.
Pseudocode | Yes | Algorithm 1: Computing a Stiefel geodesic γ(t) (Edelman et al., 1998). Algorithm 2: Computing the Stiefel logarithm (Zimmermann & Hüper, 2022). Algorithm 3: Sampling under Stiefel Flow Matching. Algorithm 4: Heuristic alignment algorithm.
Open Source Code | Yes | https://github.com/aspuru-guzik-group/stiefelFM
Open Datasets | Yes | Datasets. For QM9 (Ramakrishnan et al., 2014), we use the conformers provided by the GEOM dataset. We abbreviate GEOM-Drugs (Axelrod & Gomez-Bombarelli, 2022) as GEOM.
Dataset Splits | Yes | QM9 has train/val/test splits of 104265/13056/13033 molecules, while GEOM has splits of 233625/29203/29203 molecules, or 5537598/29203/29203 conformers.
Hardware Specification | Yes | Models were trained on 4 NVIDIA A100 40GB GPUs.
Software Dependencies | No | The paper lists numerous software libraries in the acknowledgments (e.g., PyTorch, PyTorch Lightning, RDKit, NumPy, SciPy, pandas), but it cites only the papers or development teams associated with them and gives no version numbers for the software used in the experiments. For example, it lists "PyTorch (Paszke et al., 2019)", which refers to the paper describing PyTorch, not to a specific version.
Experiment Setup | Yes | Table 3: General training and sampling hyperparameters.
    Hyperparameter | QM9 | GEOM
    Epochs | 1000 | 60
    Batch size per GPU | 256 | 24
    Optimizer | AdamW | AdamW
    Learning rate | 10^-4 | 10^-4
    Learning rate warmup steps | 2000 | 2000
    Weight decay | 0.01 | 0.01
    Gradient clipping | yes | yes
    EMA decay | 0.9995 | 0.9995
    KREED: Timesteps | 1000 | 1000
    KREED: Schedule | polynomial | polynomial
    Stiefel FM: Timesteps | 200 | 200
  Table 4: Training and sampling hyperparameters for Stiefel Flow Matching.
    Dataset | Model | Timestep sampling | OT | Stochasticity γ
     | Stiefel FM | uniform | no | 0.00
     | Stiefel FM-OT | uniform | yes | 0.00
     | Stiefel FM-OT-stoch | uniform | yes | 0.10
     | Stiefel FM-ln | logit-normal | no | 0.00
     | Stiefel FM-ln-OT | logit-normal | yes | 0.00
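
The two timestep-sampling options in Table 4 and the warmup setting in Table 3 can be illustrated with a minimal sketch. This is not the authors' code. The function names `sample_timestep` and `warmup_lr` are hypothetical, the logit-normal parameters `mu` and `sigma` are assumptions (the summary does not state them), and the linear shape of the warmup is likewise assumed. Only the learning rate (10^-4) and the warmup length (2000 steps) come from Table 3.

```python
import math
import random

# Values taken from Table 3 (identical for QM9 and GEOM).
BASE_LR = 1e-4
WARMUP_STEPS = 2000


def sample_timestep(kind, rng, mu=0.0, sigma=1.0):
    """Draw a training time t from the unit interval.

    kind: "uniform" or "logit-normal", the two options listed in Table 4.
    mu, sigma: ASSUMED logit-normal parameters; not given in the summary.
    """
    if kind == "uniform":
        return rng.random()
    if kind == "logit-normal":
        z = rng.gauss(mu, sigma)
        # The sigmoid maps a Gaussian draw on the real line into (0, 1).
        return 1.0 / (1.0 + math.exp(-z))
    raise ValueError(f"unknown timestep sampling: {kind!r}")


def warmup_lr(step, base_lr=BASE_LR, warmup_steps=WARMUP_STEPS):
    """ASSUMED linear warmup: ramp from 0 to base_lr, then hold constant."""
    return base_lr * min(1.0, step / warmup_steps)
```

With the assumed `mu = 0` and `sigma = 1`, logit-normal sampling concentrates draws around t = 0.5 and places less weight near the endpoints, whereas uniform sampling weights all times equally. Under the assumed linear ramp, `warmup_lr(1000)` is half the base rate and every step from 2000 onward uses the full 10^-4.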