A Periodic Bayesian Flow for Material Generation
Authors: Hanlin Wu, Yuxuan Song, Jingjing Gong, Ziyao Cao, Yawen Ouyang, Jianbing Zhang, Hao Zhou, Wei-Ying Ma, Jingjing Liu
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments over both crystal ab initio generation and crystal structure prediction tasks demonstrate the superiority of CrysBFN, which consistently achieves new state-of-the-art on all benchmarks. ... Ablation studies are detailed in Sec. 5.3 to validate design choices. ... We compare the sampling efficiency of CrysBFN and DiffCSP over the CSP task on the MP-20 dataset... |
| Researcher Affiliation | Academia | Hanlin Wu1,2 Yuxuan Song1,3 Jingjing Gong1 Ziyao Cao1,3 Yawen Ouyang1 Jianbing Zhang4 Hao Zhou1 Wei-Ying Ma1 Jingjing Liu1 1 Institute of AI Industry Research (AIR), Tsinghua University 2 School of Vehicle and Mobility, Tsinghua University 3 Dept. of Comp. Sci. & Tech., Tsinghua University 4 School of Artificial Intelligence, Nanjing University EMAIL, EMAIL |
| Pseudocode | Yes | The training and sampling algorithm can be found in Algorithm 1 and Algorithm 2. |
| Open Source Code | Yes | Code is available at https://github.com/wu-han-lin/CrysBFN. |
| Open Datasets | Yes | Following Xie et al. (2021); Jiao et al. (2023), we choose the following datasets for evaluation: 1) Perov-5 (Castelli et al., 2012a;b)... 2) Carbon-24 (Pickard, 2020)... 3) MP-20 (Jain et al., 2013)... 4) MPTS-52 (Jiao et al., 2023)... |
| Dataset Splits | Yes | The procedure to split the datasets into training, validation, and testing subsets adheres to prior practices (Xie et al., 2021; Jiao et al., 2023). |
| Hardware Specification | Yes | All training experiments are conducted on a server with 8 NVIDIA RTX 3090 GPUs, 64 Intel Xeon Platinum 8362 CPUs, and 256 GB of memory. |
| Software Dependencies | No | The paper mentions an "AdamW optimizer" but does not specify the versions of any key software libraries or frameworks used for the implementation. |
| Experiment Setup | Yes | For the network, the CSPNet has 6 layers, 512 hidden states, and 128 frequencies for the Fourier feature for each task and dataset, following (Jiao et al., 2023). For BFN hyper-parameters, we set σ₁² = 0.001 for continuous variable generation and β₁ = 1000 for circular variable generation across all datasets and tasks. For discrete variables, we set β₁ = 0.4 for the MP-20 dataset and β₁ = 3.0 for the Perov-5 dataset. The number of steps is searched in {50, 100, 500, 1000, 2000}. For optimization, we apply an AdamW optimizer with an initial learning rate of 1×10⁻³ and a plateau scheduler with a decaying factor of 0.6, a patience of 100 epochs, and a minimal learning rate of 1×10⁻⁴. The weight of every loss is 5×10⁻². The network is trained for 4000, 5000, 1500, and 1000 epochs for Perov-5, Carbon-24, MP-20, and MPTS-52 respectively. |
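
The optimization schedule reported above (AdamW at 1×10⁻³, decayed by 0.6 after 100 stagnant epochs, floored at 1×10⁻⁴) can be sketched with a minimal plateau scheduler. This is an illustrative reimplementation of the schedule's logic, not the authors' code; the class name `PlateauLR` and its interface are assumptions for the example.

```python
class PlateauLR:
    """Reduce the learning rate by `factor` after `patience` epochs
    without validation-loss improvement, never going below `min_lr`.
    Defaults mirror the hyper-parameters quoted in the paper."""

    def __init__(self, lr=1e-3, factor=0.6, patience=100, min_lr=1e-4):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.min_lr = min_lr
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        # An improving epoch resets the patience counter.
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                # Decay, but respect the minimal learning rate.
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_epochs = 0
        return self.lr

sched = PlateauLR()
# One improving epoch, then 101 stagnant epochs: exactly one decay fires,
# taking the learning rate from 1e-3 toward 6e-4.
for _ in range(102):
    lr = sched.step(1.0)
print(lr)
```

In a PyTorch setup this behavior corresponds to `torch.optim.AdamW` combined with `torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.6, patience=100, min_lr=1e-4)`.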