Diffusion on Graph: Augmentation of Graph Structure for Node Classification
Authors: Yancheng Wang, Changyu Liu, Yingzhen Yang
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on various graph datasets for semi-supervised node classification and graph contrastive learning have been conducted to demonstrate the effectiveness of DoG with low-rank regularization. The code of DoG is available at https://github.com/Statistical-Deep-Learning/DoG. |
| Researcher Affiliation | Academia | Yancheng Wang, Changyu Liu, Yingzhen Yang, School of Computing and Augmented Intelligence, Arizona State University |
| Pseudocode | Yes | Algorithm 1 and Algorithm 2 in Section D of the appendix describe the training algorithm of DoG and the generation process of the augmented graph Gaug in detail. |
| Open Source Code | Yes | The code of DoG is available at https://github.com/Statistical-Deep-Learning/DoG. |
| Open Datasets | Yes | We conduct experiments on five public benchmarks that are widely used for node classification on attributed graphs, namely Cora, Citeseer, Pubmed (Sen et al., 2008a), Coauthor CS, and ogbn-arxiv (Hu et al., 2020). Details on the statistics of the dataset are deferred to Table 6 in Section E.1 of the appendix. |
| Dataset Splits | Yes | For all our experiments, we follow the default separation (Shchur et al., 2018; Mernyei & Cangea, 2020; Hu et al., 2020) of training, validation, and test sets on each benchmark. ... We select the values of γ, τ, and β by performing 5-fold cross-validation on 20% of the training data in each dataset. |
| Hardware Specification | Yes | We perform all the experiments in our paper on one NVIDIA Tesla A100 GPU. |
| Software Dependencies | No | The paper mentions using optimizers (Adam, AdamW), MLPs, and GAT, but does not provide specific version numbers for any libraries, programming languages (like Python), or frameworks (like PyTorch/TensorFlow/CUDA). |
| Experiment Setup | Yes | We use Adam optimizer with a learning rate of 0.001 for the training of the GAE. The weight decay is set to 1 × 10⁻⁵. ... We use the AdamW optimizer to optimize the LDM with a learning rate of 0.0002 and a weight decay factor of 0.0001. ... The guidance strength of CFG is set to 0.5 in our experiments. A three-layer Multilayer Perceptron (MLP) is used as the denoising model in the LDM, whose hidden dimension is set to 512. We train LDM for 3000 epochs and keep track of the exponential moving average (EMA) of the model during the training with a decay factor of 0.995. ... The value of γ is selected from {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9}. The value of τ is selected from {0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5}. The value of β is selected from {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. |
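The hyperparameter grids and the EMA rule quoted in the Experiment Setup row can be sketched in plain Python. This is a minimal illustration, not code from the DoG repository: the function names and the list-of-floats weight representation are assumptions for the example.

```python
import itertools

# Search grids for gamma, tau, and beta, as listed in the paper's setup.
gamma_grid = [round(0.1 * i, 2) for i in range(1, 10)]   # 0.1 ... 0.9
tau_grid   = [round(0.05 * i, 2) for i in range(1, 11)]  # 0.05 ... 0.5
beta_grid  = list(range(1, 11))                          # 1 ... 10

def candidate_configs():
    """All (gamma, tau, beta) combinations evaluated via 5-fold CV."""
    return list(itertools.product(gamma_grid, tau_grid, beta_grid))

def ema_update(ema_weights, weights, decay=0.995):
    """One EMA step with the decay factor 0.995 used for the LDM."""
    return [decay * e + (1 - decay) * w
            for e, w in zip(ema_weights, weights)]
```

With these grids, the cross-validation sweep covers 9 × 10 × 10 = 900 candidate configurations per dataset.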