SparseDiff: Sparse Discrete Diffusion for Scalable Graph Generation
Authors: Yiming Qin, Clément Vignac, Pascal Frossard
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that SparseDiff consistently achieves state-of-the-art performance on both small graphs, including those with complex priors like molecular graphs, and large graph datasets. Notably, SparseDiff handles much larger graphs (up to 2485 nodes), while the performance of DiGress degrades on graphs with over 200 nodes. Specifically, SparseDiff outperforms both dense models like SPECTRE (Martinkus et al., 2022) and DiGress (Vignac et al., 2023a), as well as other scalable models like EDGE (Chen et al., 2023) and HGGT (Jang et al., 2023). SparseDiff also converges four times faster than dense diffusion models on large graphs, such as social networks. |
| Researcher Affiliation | Academia | Yiming Qin EMAIL Ecole Polytechnique Fédérale de Lausanne (EPFL) Clément Vignac EMAIL Ecole Polytechnique Fédérale de Lausanne (EPFL) Pascal Frossard EMAIL Ecole Polytechnique Fédérale de Lausanne (EPFL) |
| Pseudocode | Yes | Algorithm 1 Sparse training at step t with the sparsity parameter λ (Section 3.1 & 3.2) ... Algorithm 2 Iterative inference at step t with the sparsity parameter λ (Section 3.3) |
| Open Source Code | Yes | 1Codes available at https://github.com/qym7/SparseDiff. |
| Open Datasets | Yes | We evaluate SparseDiff on diverse graph datasets against a broad range of baselines, including GraphRNN (You et al., 2018), GRAN (Liao et al., 2019), GraphNVP (Madhawa et al., 2019), SPECTRE (Martinkus et al., 2022), GDSS (Jo et al., 2022), DiGress (Vignac et al., 2023a), DruM (Jo et al., 2023), and scalable models such as BiGG (Dai et al., 2020), GraphARM (Kong et al., 2023), EDGE (Chen et al., 2023), HiGen (Karami, 2023), and HGGT (Jang et al., 2023). We refer to the method from Bergmeister et al. (2023) as GraphLE. We report results as originally published to ensure fair and consistent comparison. ... The QM9 dataset (Wu et al., 2018) features molecules with up to 9 heavy atoms, while the MOSES benchmark (Polykovskiy et al., 2020), derived from ZINC Clean Leads, includes drug-sized molecules with extensive assessment tools. |
| Dataset Splits | Yes | As for dataset splits, we adhere to the framework established by DiGress. Specifically, for the QM9 dataset, we implement a split comprising 100k molecules for training, 20k for validation, and 13k for evaluating likelihood on the test set. For the Planar, SBM, and Protein datasets, employing a seed of 1234, we randomly assign 20% of the graphs to testing, while 80% of the remaining graphs are utilized for training, and 20% for validation. For the Ego dataset, to ensure consistency with previous methods and a fair comparison, we maintain a split of 80% for training and 20% for testing, with 20% of the training set additionally used for validation purposes. |
| Hardware Specification | Yes | In our experimental setup, we utilize a single V100-32G GPU machine, which is particularly prone to scalability issues, to demonstrate that our method allows users with limited GPU resources to effectively train on larger graphs. |
| Software Dependencies | No | The paper mentions PyTorch Geometric (Fey & Lenssen, 2019) and RDKit, but does not provide specific version numbers for these or other software libraries used. While it states that 'All configuration details are comprehensively documented in the code provided', these details are not explicitly present in the main text of the paper. |
| Experiment Setup | Yes | Training setup: The model is trained using a diffusion-based framework consisting of 1,000 denoising steps. To balance the objectives associated with node and edge predictions, fixed loss coefficients of 5 and 2 are applied to edge and node terms, respectively. The learning rate is set to a constant value of 0.0002 across all datasets. All experiments are conducted on a single V100 GPU with 32GB memory. The only introduced hyperparameter is the sparsity controller λ, which is selected based on graph size according to Tab. 5. |
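The split protocol and loss weighting quoted above can be sketched in a few lines. This is a minimal illustration, not the authors' code: `split_graphs` and `combined_loss` are hypothetical names, and only the reported constants (seed 1234, 20% test, 20% validation, edge/node coefficients 5 and 2) are taken from the paper.

```python
import random

def split_graphs(graphs, seed=1234):
    """Hypothetical split mirroring the reported Planar/SBM/Protein protocol:
    with seed 1234, hold out 20% for testing, then use 20% of the remainder
    for validation and the rest for training."""
    rng = random.Random(seed)
    idx = list(range(len(graphs)))
    rng.shuffle(idx)
    n_test = int(0.2 * len(graphs))
    test_idx, rest = idx[:n_test], idx[n_test:]
    n_val = int(0.2 * len(rest))
    val_idx, train_idx = rest[:n_val], rest[n_val:]
    return ([graphs[i] for i in train_idx],
            [graphs[i] for i in val_idx],
            [graphs[i] for i in test_idx])

# Fixed loss coefficients reported in the paper: 5 on the edge term, 2 on the node term.
EDGE_COEF, NODE_COEF = 5.0, 2.0

def combined_loss(edge_loss, node_loss):
    """Hypothetical weighted sum of the per-term denoising losses."""
    return EDGE_COEF * edge_loss + NODE_COEF * node_loss
```

For 100 graphs this yields a 64/16/20 train/validation/test split, matching the stated 80/20 then 80/20 scheme.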