MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks
Authors: Nayoung Kim, Seongsu Kim, Minsu Kim, Jinkyoo Park, Sungsoo Ahn
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiment demonstrates that MOFFLOW accurately predicts MOF structures containing several hundred atoms, significantly outperforming conventional methods and state-of-the-art machine learning baselines while being much faster. ... We benchmark our algorithm with the MOF dataset compiled by Boyd et al. (2019), consisting of 324,426 structures. We compare with conventional and deep learning-based algorithms for crystal structure prediction. ... Table 1: Structure prediction accuracy. ... Figure 5: Property distributions. |
| Researcher Affiliation | Academia | Nayoung Kim¹, Seongsu Kim², Minsu Kim¹, Jinkyoo Park¹, Sungsoo Ahn¹; ¹Korea Advanced Institute of Science and Technology (KAIST), ²Pohang University of Science and Technology (POSTECH). Email: seongsukim@postech.ac.kr |
| Pseudocode | Yes | Algorithm 1 MOFAttention module ... Algorithm 2 Node Update Module ... Algorithm 3 Edge Update Module ... Algorithm 4 Backbone Update Module |
| Open Source Code | Yes | Code available at https://github.com/nayoung10/MOFFlow. ... We provide our codes and model checkpoint in https://github.com/nayoung10/mofflow. |
| Open Datasets | Yes | We benchmark our algorithm with the MOF dataset compiled by Boyd et al. (2019), consisting of 324,426 structures. ... The dataset is distributed on the MATERIALSCLOUD. |
| Dataset Splits | Yes | After filtering structures with fewer than 200 blocks, we split the data into train/valid/test sets in an 8:1:1 ratio. ... The dataset is divided into train, valid, and test in an 8:1:1 ratio. The statistics of the data splits are presented in Tables 4, 5, and 6. |
| Hardware Specification | Yes | To address memory constraints, we replaced fully connected edge construction with a radius graph (cutoff: 5 Å) and used a batch size of 8. The model was trained on a 24GB NVIDIA RTX 3090 GPU for 5 days until convergence. ... Table 9: Computational resources. DiffCSP: 24GB 3090 (1); MOFFLOW (Timestep Batch): 24GB 3090 (8); MOFFLOW (Batch): 24GB 3090 (8). |
| Software Dependencies | No | The paper mentions several software components (pymatgen, Zeo++, CHGnet, DiffDock, egnn, MOFDiff, protein-frame-flow) and states that their implementation is built upon some of them, but it does not provide specific version numbers for any of these dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | For DiffCSP, we follow the hyperparameters from Jiao et al. (2024a) and train for 200 epochs. Our method uses an AdamW optimizer (Loshchilov, 2017) with a learning rate of 10⁻⁴ and β = (0.9, 0.98). We use a maximum batch size of 160 and run inference with 50 integration steps... Table 8: Training hyperparameters of MOFFLOW. loss coefficient λ1 (q) 1.0, loss coefficient λ2 (τ) 2.0, loss coefficient λ3 (ℓ) 0.1, batch size 160, maximum N² 1,600,000, optimizer AdamW, initial learning rate 0.0001, betas (0.9, 0.98), learning rate scheduler ReduceLROnPlateau, learning rate patience 30 epochs, learning rate factor 0.6. |
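The training hyperparameters reported in Table 8 can be collected into a single configuration sketch. This is a minimal illustration using the paper's reported values; the dictionary keys, the `next_lr` helper, and its signature are hypothetical and are not the authors' actual code, which lives at https://github.com/nayoung10/MOFFlow.

```python
# Hypothetical config sketch of MOFFlow's reported training setup (Table 8).
# Key names are illustrative; only the values come from the paper.
TRAIN_CONFIG = {
    "loss_coeff_q": 1.0,             # rotation loss coefficient λ1
    "loss_coeff_tau": 2.0,           # translation loss coefficient λ2
    "loss_coeff_lattice": 0.1,       # lattice loss coefficient λ3
    "batch_size": 160,               # maximum batch size
    "max_n_squared": 1_600_000,      # reported "maximum N²" batch budget
    "optimizer": "AdamW",
    "learning_rate": 1e-4,
    "betas": (0.9, 0.98),
    "lr_scheduler": "ReduceLROnPlateau",
    "lr_patience_epochs": 30,
    "lr_factor": 0.6,
    "inference_integration_steps": 50,
}

def next_lr(current_lr: float, epochs_without_improvement: int,
            patience: int = 30, factor: float = 0.6) -> float:
    """ReduceLROnPlateau-style rule: multiply the learning rate by
    `factor` once validation loss has stalled for `patience` epochs,
    otherwise leave it unchanged."""
    if epochs_without_improvement >= patience:
        return current_lr * factor
    return current_lr
```

With the reported patience of 30 epochs and factor of 0.6, a stalled run would step the learning rate from 1e-4 down to 6e-5 at the first plateau.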