DeFoG: Discrete Flow Matching for Graph Generation

Authors: Yiming Qin, Manuel Madeira, Dorina Thanou, Pascal Frossard

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate state-of-the-art performance across synthetic, molecular, and digital pathology datasets, covering both unconditional and conditional generation settings.
Researcher Affiliation | Academia | EPFL, Lausanne, Switzerland. Correspondence to: Yiming Qin <EMAIL>, Manuel Madeira <EMAIL>.
Pseudocode | Yes | Algorithm 1: DeFoG Training; Algorithm 2: DeFoG Sampling.
Open Source Code | Yes | Code at github.com/manuelmlmadeira/DeFoG.
Open Datasets | Yes | We evaluate DeFoG using the widely adopted Planar, SBM (Martinkus et al., 2022), and Tree datasets (Bergmeister et al., 2023), along with the associated evaluation methodology. For the QM9 dataset, we follow the dataset split and evaluation metrics from Vignac et al. (2022), presenting the results in Appendix F.2.2, Tab. 8. For the larger MOSES and Guacamol datasets, we adhere to the training setup and evaluation metrics established by Polykovskiy et al. (2020) and Brown et al. (2019), respectively, with results in Tabs. 9 and 10. Lastly, we also include the ZINC250k dataset (Sterling & Irwin, 2015), which contains 249,455 molecules with up to 38 heavy atoms from 9 element types.
Dataset Splits | Yes | For the QM9 dataset, we follow the dataset split and evaluation metrics from Vignac et al. (2022). For the larger MOSES and Guacamol datasets, we adhere to the training setup and evaluation metrics established by Polykovskiy et al. (2020) and Brown et al. (2019), respectively, with results in Tabs. 9 and 10.
Hardware Specification | Yes | All the experiments in this work were run on a single NVIDIA A100-SXM4-80GB GPU.
Software Dependencies | No | The paper mentions using graph transformers and Relative Random Walk Probabilities (RRWP) but does not specify software versions for these tools or for the programming language and core libraries (e.g., Python and PyTorch versions).
Experiment Setup | Yes | In Tab. 5, we specifically highlight their values for the proposed training and sampling strategies (Sec. 3 and Appendix C.1), and the conditional guidance parameter (see Appendix E). Table 5: Training and sampling parameters for full-step sampling (500 or 1000 steps for molecular and synthetic datasets, respectively).