DeFoG: Discrete Flow Matching for Graph Generation
Authors: Yiming Qin, Manuel Madeira, Dorina Thanou, Pascal Frossard
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate state-of-the-art performance across synthetic, molecular, and digital pathology datasets, covering both unconditional and conditional generation settings. |
| Researcher Affiliation | Academia | EPFL, Lausanne, Switzerland. Correspondence to: Yiming Qin <EMAIL>, Manuel Madeira <EMAIL>. |
| Pseudocode | Yes | Algorithm 1: DeFoG Training; Algorithm 2: DeFoG Sampling |
| Open Source Code | Yes | Code at github.com/manuelmlmadeira/DeFoG. |
| Open Datasets | Yes | We evaluate DeFoG using the widely adopted Planar, SBM (Martinkus et al., 2022), and Tree datasets (Bergmeister et al., 2023), along with the associated evaluation methodology. For the QM9 dataset, we follow the dataset split and evaluation metrics from Vignac et al. (2022), presenting the results in Appendix F.2.2, Tab. 8. For the larger MOSES and Guacamol datasets, we adhere to the training setup and evaluation metrics established by Polykovskiy et al. (2020) and Brown et al. (2019), respectively, with results in Tabs. 9 and 10. Lastly, we also include the ZINC250k dataset (Sterling & Irwin, 2015), which contains 249,455 molecules with up to 38 heavy atoms from 9 element types. |
| Dataset Splits | Yes | For the QM9 dataset, we follow the dataset split and evaluation metrics from Vignac et al. (2022). For the larger MOSES and Guacamol datasets, we adhere to the training setup and evaluation metrics established by Polykovskiy et al. (2020) and Brown et al. (2019), respectively, with results in Tabs. 9 and 10. |
| Hardware Specification | Yes | All the experiments in this work were run on a single NVIDIA A100-SXM4-80GB GPU. |
| Software Dependencies | No | The paper mentions using graph transformers and Relative Random Walk Probabilities (RRWP) but does not specify software versions for these tools or for the programming language and core libraries (e.g., Python, PyTorch versions). |
| Experiment Setup | Yes | In Tab. 5, we specifically highlight their values for the proposed training and sampling strategies (Sec. 3 and Appendix C.1), and conditional guidance parameter (see Appendix E). Table 5: Training and sampling parameters for full-step sampling (500 or 1000 steps for molecular and synthetic datasets, respectively). |
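The pseudocode row above refers to Algorithm 1 (DeFoG Training) and Algorithm 2 (DeFoG Sampling) in the paper. As a rough illustration of the general shape of such discrete flow matching steps, the sketch below assumes a linear interpolation path from a uniform noise distribution to data, with hypothetical function names; it is not the authors' implementation.

```python
import numpy as np

def sample_xt(x1, t, num_classes, rng):
    """Corrupt clean tokens x1 to time t along a linear interpolation path.

    Assumption (not from the paper's code): with probability t a token keeps
    its clean value, otherwise it is resampled uniformly over the classes.
    """
    noise = rng.integers(num_classes, size=x1.shape)
    keep = rng.random(x1.shape) < t
    return np.where(keep, x1, noise)

def training_step(x1, t, num_classes, rng):
    """Algorithm-1-style step (sketch): build the (input, target) pair.

    A denoising model would be trained with cross-entropy to predict the
    clean tokens x1 from the corrupted input x_t; the model itself is
    omitted here.
    """
    xt = sample_xt(x1, t, num_classes, rng)
    return xt, x1  # (model input, denoising target)

def sampling_step(xt, x1_pred, t, dt, rng):
    """Algorithm-2-style Euler step (sketch): each token jumps to its
    predicted clean value with probability dt / (1 - t), consistent with
    the linear interpolation path assumed above."""
    p_jump = dt / (1.0 - t)
    jump = rng.random(xt.shape) < p_jump
    return np.where(jump, x1_pred, xt)
```

In this sketch, at t = 1 the corrupted input equals the data, and a full sampling pass would iterate `sampling_step` from t = 0 to t = 1 with a model supplying `x1_pred` at each step.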