Periodic Materials Generation using Text-Guided Joint Diffusion Model
Authors: Kishalay Das, Subhojyoti Khastagir, Pawan Goyal, Seung-Cheol Lee, Satadeep Bhattacharjee, Niloy Ganguly
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments using popular datasets on benchmark tasks reveal that TGDMat outperforms existing baseline methods by a good margin. Notably, for the structure prediction task, with just one generated sample, TGDMat outperforms all baseline models, highlighting the importance of text-guided diffusion. Further, in the generation task, TGDMat surpasses all baselines and their text-fusion variants, showcasing the effectiveness of the joint diffusion paradigm. |
| Researcher Affiliation | Academia | 1 Indian Institute of Technology, Kharagpur, India 2 Indo Korea Science and Technology Center, Bangalore, India |
| Pseudocode | Yes | Algorithm 1 Training Algorithm Algorithm 2 Sampling Algorithm |
| Open Source Code | Yes | Code is available at https://github.com/kdmsit/TGDMat |
| Open Datasets | Yes | Perov-5 (Castelli et al., 2012a;b), Carbon-24 (Pickard, 2020), and MP-20 (Jain et al., 2013b) |
| Dataset Splits | Yes | While training TGDMat, we split the datasets into the train, test, and validation sets following the convention of 60:20:20 (Xie et al., 2021). |
| Hardware Specification | Yes | We perform all the experiments on a Tesla P100-PCIE-16GB GPU server. |
| Software Dependencies | No | The paper mentions tools and models like MatSciBERT and Pymatgen but does not provide specific version numbers for general software dependencies or libraries used in the implementation. |
| Experiment Setup | Yes | In our TGDMat model, we adopted a 4-layer CSPNet as the message-passing layer with the hidden dimension set to 512. Further, we use pre-trained MatSciBERT (Gupta et al., 2022) followed by a two-layer projection layer (projection dimension 64) as the text encoder module. We keep the dimension of the time embedding at each diffusion timestep as 64. We train it for 500 epochs using the same optimizer and learning rate scheduler as DiffCSP, and keep the batch size as 512. |
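The reported hyperparameters and the 60:20:20 train/test/validation split can be collected into a small configuration sketch. This is a hypothetical summary for reproduction purposes, not code from the TGDMat repository; the class and helper names below are assumptions:

```python
from dataclasses import dataclass

@dataclass
class TGDMatConfig:
    """Hyperparameters as reported in the paper's experiment setup."""
    num_cspnet_layers: int = 4     # message-passing (CSPNet) layers
    hidden_dim: int = 512          # CSPNet hidden dimension
    text_projection_dim: int = 64  # projection on top of MatSciBERT embeddings
    time_embedding_dim: int = 64   # diffusion timestep embedding dimension
    epochs: int = 500
    batch_size: int = 512

def split_sizes(n: int, ratios=(0.6, 0.2, 0.2)):
    """Compute train/test/validation sizes for the reported 60:20:20 split.

    The remainder after the train and test slices goes to validation so the
    three parts always sum to n.
    """
    n_train = int(n * ratios[0])
    n_test = int(n * ratios[1])
    n_val = n - n_train - n_test
    return n_train, n_test, n_val
```

For example, `split_sizes(18928)` (the approximate MP-20 size is an assumption here) returns sizes that sum exactly to the dataset size, which is the property a reproduction script needs.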