Notice: The reproducibility variables underlying each score are classified by an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

DIG: A Turnkey Library for Diving into Graph Deep Learning Research

Authors: Meng Liu, Youzhi Luo, Limei Wang, Yaochen Xie, Hao Yuan, Shurui Gui, Haiyang Yu, Zhao Xu, Jingtun Zhang, Yi Liu, Keqiang Yan, Haoran Liu, Cong Fu, Bora M Oztekin, Xuan Zhang, Shuiwang Ji

JMLR 2021 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To facilitate graph deep learning research, we introduce DIG: Dive into Graphs, a turnkey library that provides a unified testbed for higher level, research-oriented graph deep learning tasks. Currently, we consider graph generation, self-supervised learning on graphs, explainability of graph neural networks, and deep learning on 3D graphs. For each direction, we provide unified implementations of data interfaces, common algorithms, and evaluation metrics. Altogether, DIG is an extensible, open-source, and turnkey library for researchers to develop new methods and effortlessly compare with common baselines using widely used datasets and evaluation metrics. In addition, for APIs corresponding to advanced algorithms, we provide the benchmark examples, which can reproduce the experimental results reported in the original papers within reasonable or negligible differences.
Researcher Affiliation | Academia | Meng Liu EMAIL, Youzhi Luo EMAIL, Limei Wang EMAIL, Yaochen Xie EMAIL, Hao Yuan EMAIL, Shurui Gui EMAIL, Haiyang Yu EMAIL, Zhao Xu EMAIL, Jingtun Zhang EMAIL, Yi Liu EMAIL, Keqiang Yan EMAIL, Haoran Liu EMAIL, Cong Fu EMAIL, Bora Oztekin EMAIL, Xuan Zhang EMAIL, Shuiwang Ji EMAIL; Department of Computer Science and Engineering, Texas A&M University, College Station, TX 77843-3112, USA
Pseudocode | No | The paper describes a software library (DIG) and the types of algorithms it implements (e.g., JT-VAE, InfoGraph), along with data interfaces and evaluation metrics. However, it does not contain any structured pseudocode or algorithm blocks detailing these methods.
Open Source Code | Yes | Source code is available at https://github.com/divelab/DIG.
Open Datasets | Yes | We implement data interfaces for widely used datasets. These are QM9 (Ramakrishnan et al., 2014), ZINC250k (Irwin et al., 2012), and MOSES (Polykovskiy et al., 2020)... We provide the data interfaces of TUDataset (i.e., NCI1, PROTEINS, etc.) (Morris et al., 2020) for graph-level classification tasks, and citation networks (i.e., Cora, CiteSeer, and PubMed) (Yang et al., 2016) for node-level classification tasks... For data interfaces, we consider the widely used synthetic datasets (i.e., BA-shapes, BA-Community, etc.) (Ying et al., 2019; Luo et al., 2020) and molecule datasets (i.e., BBBP, Tox21, etc.) (Wu et al., 2018)... We implement data interfaces for two benchmark datasets: QM9 (Ramakrishnan et al., 2014) and MD17 (Chmiela et al., 2017).
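As context for the data-interface claim above, loading one of these datasets through DIG typically takes a single import and constructor call. The sketch below is hedged: the module path `dig.threedgraph.dataset.QM93D` and the install name `dive-into-graphs` reflect the public DIG repository layout but are assumptions of this card, not details confirmed by the paper; the import guard keeps the snippet runnable even when DIG is absent.

```python
def load_qm9_3d(root="data/qm9"):
    """Attempt to load the QM9 3D dataset via DIG's data interface.

    The import path below is an assumption based on the public DIG
    repository layout and may differ across DIG versions.
    """
    try:
        from dig.threedgraph.dataset import QM93D  # assumed module path
    except ImportError:
        # DIG not installed; the PyPI name is assumed to be dive-into-graphs
        return None
    return QM93D(root=root)

dataset = load_qm9_3d()
print("QM9 loaded" if dataset is not None else "DIG not installed")
```

The guard mirrors how optional heavy dependencies are usually probed, so the snippet degrades gracefully on machines without DIG.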
Dataset Splits | No | The paper states that DIG provides 'unified implementations of data interfaces, common algorithms, and evaluation metrics' and enables 'empirical comparisons with baselines using widely used datasets and evaluation metrics'. It also states that DIG offers 'benchmark examples, which can reproduce the experimental results reported in the original papers'. However, the paper does not explicitly detail the dataset splits (e.g., percentages, sample counts, or splitting methodology) used for the library's benchmarks or validation.
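For readers unfamiliar with what a "dataset split" specification would look like, graph benchmarks that omit this detail are commonly assumed to use a seeded random partition, often 80/10/10. The helper below illustrates that convention only; the ratios, seed, and function name are illustrative assumptions, not DIG's documented protocol.

```python
import random

def random_split(n, frac_train=0.8, frac_valid=0.1, seed=42):
    """Seeded random index split; 80/10/10 ratios are illustrative."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # deterministic given the seed
    n_train = int(frac_train * n)
    n_valid = int(frac_valid * n)
    return (idx[:n_train],
            idx[n_train:n_train + n_valid],
            idx[n_train + n_valid:])

train_idx, valid_idx, test_idx = random_split(1000)
print(len(train_idx), len(valid_idx), len(test_idx))  # 800 100 100
```

Reporting the seed alongside the ratios is what makes such a split reproducible, which is exactly the detail the card flags as missing.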
Hardware Specification | No | The paper describes a software library, DIG, and its features. It does not provide any specific details about the hardware (e.g., GPU/CPU models, memory amounts) used for developing, testing, or running the library's benchmarks.
Software Dependencies | No | 'Our DIG is based on Python and PyTorch (Paszke et al., 2017). For some implementations, we also use PyG (Fey and Lenssen, 2019) and RDKit (Landrum et al., 2006) for basic operations on graphs and molecules.' The paper names these packages but does not provide version numbers for Python, PyTorch, PyG, or RDKit, which are necessary for reproducible dependency information.
Experiment Setup | No | The paper introduces the DIG library and its functionalities, including data interfaces, algorithms, and evaluation metrics. It highlights that DIG provides 'benchmark examples, which can reproduce the experimental results reported in the original papers'. However, the paper itself does not specify concrete hyperparameters, training configurations, or system-level settings for any experiments conducted to evaluate DIG.