MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks
Authors: Nayoung Kim, Seongsu Kim, Minsu Kim, Jinkyoo Park, Sungsoo Ahn
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiment demonstrates that MOFFLOW accurately predicts MOF structures containing several hundred atoms, significantly outperforming conventional methods and state-of-the-art machine learning baselines while being much faster. ... We benchmark our algorithm with the MOF dataset compiled by Boyd et al. (2019), consisting of 324,426 structures. We compare with conventional and deep learning-based algorithms for crystal structure prediction. ... Table 1: Structure prediction accuracy. ... Figure 5: Property distributions. |
| Researcher Affiliation | Academia | Nayoung Kim¹, Seongsu Kim², Minsu Kim¹, Jinkyoo Park¹, Sungsoo Ahn¹; ¹Korea Advanced Institute of Science and Technology (KAIST), ²Pohang University of Science and Technology (POSTECH). Email: seongsukim@postech.ac.kr |
| Pseudocode | Yes | Algorithm 1 MOFAttention module ... Algorithm 2 Node Update Module ... Algorithm 3 Edge Update Module ... Algorithm 4 Backbone Update Module |
| Open Source Code | Yes | Code available at https://github.com/nayoung10/MOFFlow. ... We provide our codes and model checkpoint in https://github.com/nayoung10/mofflow. |
| Open Datasets | Yes | We benchmark our algorithm with the MOF dataset compiled by Boyd et al. (2019), consisting of 324,426 structures. ... The dataset is distributed on the MATERIALSCLOUD. |
| Dataset Splits | Yes | After filtering structures with fewer than 200 blocks, we split the data into train/valid/test sets in an 8:1:1 ratio. ... The dataset is divided into train, valid, and test in an 8:1:1 ratio. The statistics of the data splits are presented in Tables 4, 5, and 6. |
| Hardware Specification | Yes | To address memory constraints, we replaced fully connected edge construction with a radius graph (cutoff: 5 Å) and used a batch size of 8. The model was trained on a 24GB NVIDIA RTX 3090 GPU for 5 days until convergence. ... Table 9: Computational resources. DiffCSP: 24GB 3090 (1); MOFFLOW (Timestep Batch): 24GB 3090 (8); MOFFLOW (Batch): 24GB 3090 (8). |
| Software Dependencies | No | The paper mentions several software components (pymatgen, Zeo++, CHGnet, DiffDock, egnn, MOFDiff, protein-frame-flow) and states that their implementation is built upon some of them, but it does not provide specific version numbers for any of these dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | For DiffCSP, we follow the hyperparameters from Jiao et al. (2024a) and train for 200 epochs. Our method uses an AdamW optimizer (Loshchilov, 2017) with a learning rate of 10⁻⁴ and β = (0.9, 0.98). We use a maximum batch size of 160 and run inference with 50 integration steps... Table 8: Training hyperparameters of MOFFLOW. loss coefficient λ1 (q) 1.0, loss coefficient λ2 (τ) 2.0, loss coefficient λ3 (ℓ) 0.1, batch size 160, maximum N² 1,600,000, optimizer AdamW, initial learning rate 0.0001, betas (0.9, 0.98), learning rate scheduler ReduceLROnPlateau, learning rate patience 30 epochs, learning rate factor 0.6. |
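The training hyperparameters reported in Table 8 can be collected into a single configuration sketch. This is a minimal illustration using the paper's reported values; the dictionary keys, the `next_lr` helper, and its signature are hypothetical and are not the authors' actual code, which lives at https://github.com/nayoung10/MOFFlow.

```python
# Hypothetical config sketch of MOFFlow's reported training setup (Table 8).
# Key names are illustrative; only the values come from the paper.
TRAIN_CONFIG = {
    "loss_coeff_q": 1.0,             # rotation loss coefficient λ1
    "loss_coeff_tau": 2.0,           # translation loss coefficient λ2
    "loss_coeff_lattice": 0.1,       # lattice loss coefficient λ3
    "batch_size": 160,               # maximum batch size
    "max_n_squared": 1_600_000,      # reported "maximum N²" batch budget
    "optimizer": "AdamW",
    "learning_rate": 1e-4,
    "betas": (0.9, 0.98),
    "lr_scheduler": "ReduceLROnPlateau",
    "lr_patience_epochs": 30,
    "lr_factor": 0.6,
    "inference_integration_steps": 50,
}

def next_lr(current_lr: float, epochs_without_improvement: int,
            patience: int = 30, factor: float = 0.6) -> float:
    """ReduceLROnPlateau-style rule: multiply the learning rate by
    `factor` once validation loss has stalled for `patience` epochs,
    otherwise leave it unchanged."""
    if epochs_without_improvement >= patience:
        return current_lr * factor
    return current_lr
```

With the reported patience of 30 epochs and factor of 0.6, a stalled run would step the learning rate from 1e-4 down to 6e-5 at the first plateau.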