Learning Smooth and Expressive Interatomic Potentials for Physical Property Prediction

Authors: Xiang Fu, Brandon M Wood, Luis Barroso-Luque, Daniel S. Levine, Meng Gao, Misko Dzamba, C. Lawrence Zitnick

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The resulting model, eSEN, provides state-of-the-art results on a range of physical property prediction tasks, including materials stability prediction, thermal conductivity prediction, and phonon calculations. In this section, we evaluate eSEN in physical property prediction tasks: (1) materials stability prediction based on geometry optimization; (2) thermal conductivity prediction; and (3) phonon calculation.
Researcher Affiliation | Industry | Fundamental AI Research (FAIR) at Meta. Correspondence to: Xiang Fu <EMAIL>, C. Lawrence Zitnick <EMAIL>.
Pseudocode | No | The paper describes methods in prose and provides architecture diagrams (Figure 2) but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at: https://github.com/facebookresearch/fairchem. Pretrained checkpoints are available at https://huggingface.co/facebook/OMAT24.
Open Datasets | Yes | We trained eSEN models under the same hyperparameters while ablating one design choice at a time. We construct out-of-distribution (OOD) MD simulation tasks for both inorganic materials and organic molecules using models trained on the MPTrj (Jain et al., 2013; Deng et al., 2023) and the SPICE-MACE-OFF (Eastman et al., 2023; Kovács et al., 2023) datasets. The Matbench-Discovery benchmark evaluates a model's ability to predict ground-state (0 K) thermodynamic stability through geometry optimization and energy prediction. It is a widely used benchmark for evaluating ML models in materials discovery.
Dataset Splits | Yes | Since MPTrj lacks an official test split and various models are typically trained on distinct subsets of the data, we randomly selected 5000 samples from the subsampled Alexandria (sAlex) dataset (Schmidt et al., 2024; Barroso-Luque et al., 2024) for a fair comparison. The eSEN-30M-OAM model starts from the eSEN-30M-OMat model, and was finetuned for 1 epoch on a dataset constructed by combining the sAlex training dataset and 8 copies of the MPTrj training dataset.
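The finetuning mixture described above (the sAlex training set plus 8 copies of the MPTrj training set) can be sketched with PyTorch's `ConcatDataset`. This is an illustrative assumption about the mechanics, not the paper's actual data pipeline; the dataset objects and sizes here are stand-ins.

```python
import torch
from torch.utils.data import ConcatDataset, TensorDataset

# Stand-in datasets; names and sizes are illustrative assumptions.
salex_train = TensorDataset(torch.randn(1000, 3))
mptrj_train = TensorDataset(torch.randn(100, 3))

# sAlex training data plus 8 copies of MPTrj, as described for the
# 1-epoch eSEN-30M-OAM finetuning stage.
finetune_data = ConcatDataset([salex_train] + [mptrj_train] * 8)

print(len(finetune_data))  # 1000 + 8 * 100 = 1800
```

Duplicating the smaller dataset is a simple way to upweight it during a single finetuning epoch without changing the sampling code.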
Hardware Specification | Yes | We benchmarked the inference speed of our models against the similar sized MACE-OFF-L (Kovács et al., 2023) (4.7M) on a single 80GB Nvidia A100 GPU.
Software Dependencies | Yes | All models are benchmarked using the standard Python (3.12) runtime with PyTorch v2.4.0 (Paszke et al., 2019) and CUDA 12.1 (Nickolls et al., 2008).
Experiment Setup | Yes | Hyper-parameters used for model training are shown in Table 6. We train all models using a per-atom energy MAE loss, a force L2 loss, and a stress MAE loss. For direct-force models or direct-force pre-training, we use the same decomposed loss as described in Barroso-Luque et al. (2024).
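The three-term training objective quoted above (per-atom energy MAE, force L2, stress MAE) can be sketched as a single PyTorch function. The tensor layouts and the loss weights `w_e`, `w_f`, `w_s` are assumptions for illustration; the paper's actual coefficients are given in its Table 6.

```python
import torch

def potential_loss(pred, target, natoms, w_e=1.0, w_f=1.0, w_s=1.0):
    """Combined interatomic-potential loss (sketch, layout assumed):
    per-atom energy MAE + per-atom force L2 norm + stress MAE.

    pred/target: dicts with "energy" (B,), "forces" (N_atoms, 3),
    "stress" (B, 3, 3); natoms: (B,) atom counts per structure.
    """
    # Energy MAE normalized by the number of atoms in each structure.
    e_loss = torch.mean(torch.abs(pred["energy"] - target["energy"]) / natoms)
    # L2 (Euclidean) norm of the per-atom force error, averaged over atoms.
    f_loss = torch.mean(torch.norm(pred["forces"] - target["forces"], dim=-1))
    # MAE over the components of the stress tensor.
    s_loss = torch.mean(torch.abs(pred["stress"] - target["stress"]))
    return w_e * e_loss + w_f * f_loss + w_s * s_loss

pred = {"energy": torch.tensor([10.0]),
        "forces": torch.zeros(4, 3),
        "stress": torch.zeros(1, 3, 3)}
target = {"energy": torch.tensor([12.0]),
          "forces": torch.zeros(4, 3),
          "stress": torch.zeros(1, 3, 3)}
loss = potential_loss(pred, target, natoms=torch.tensor([4.0]))
print(loss.item())  # |10 - 12| / 4 = 0.5
```

Normalizing the energy error by atom count keeps structures of very different sizes on a comparable scale, which matters when training on mixed datasets such as MPTrj and sAlex.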