Multi-label Node Classification On Graph-Structured Data

Authors: Tianqi Zhao, Thi Ngan Dong, Alan Hanjalic, Megha Khosla

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we perform a large-scale comparative study with 8 methods and 9 datasets and analyse the performances of the methods to assess the progress made by current state of the art in the multi-label node classification scenario."
Researcher Affiliation | Academia | "Tianqi Zhao (EMAIL), Department of Intelligent Systems, Delft University of Technology; Ngan Thi Dong (EMAIL), L3S Research Center, Hannover, Germany; Alan Hanjalic (EMAIL), Delft University of Technology; Megha Khosla (EMAIL), Delft University of Technology"
Pseudocode | No | The paper describes methodologies and processes but does not include any clearly labeled pseudocode or algorithm blocks. It provides mathematical definitions and descriptions in prose.
Open Source Code | Yes | "We release our benchmark at https://github.com/Tianqi-py/MLGNC." "Our code is available at https://github.com/Tianqi-py/MLGNC."
Open Datasets | Yes | "The first challenge in conducting focused studies on multi-label node classification is the limited number of publicly available multi-label graph datasets. Therefore, as our first contribution, we collect and release three real-world biological datasets and develop a multi-label graph generator to generate datasets with tunable properties." "In particular, we curate 3 biological graph datasets using publicly available data. The detailed pre-processing steps and the original data sources are discussed in Appendix A.1.1, A.1.2, and A.1.3."
Dataset Splits | Yes | "For all datasets except OGB-Proteins, Hum Loc, and Euk Loc we generate 3 random training, validation, and test splits with 60%, 20%, and 20% of the data. For OGB-Proteins, Hum Loc, and Euk Loc we follow the predefined data splits from (Hu et al., 2020), (Shen & Chou, 2007) and (Chou & Shen, 2007) respectively."
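The split procedure quoted above (three independent random 60/20/20 train/validation/test splits) can be sketched as follows. This is a minimal illustration, not the paper's released code; the node count and seed values are illustrative assumptions.

```python
import numpy as np

def random_split(num_nodes, seed, train_frac=0.6, val_frac=0.2):
    """Shuffle node indices and cut them into train/val/test parts."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_nodes)
    n_train = int(train_frac * num_nodes)
    n_val = int(val_frac * num_nodes)
    return (perm[:n_train],                 # 60% training nodes
            perm[n_train:n_train + n_val],  # 20% validation nodes
            perm[n_train + n_val:])         # 20% test nodes

# Three random splits, one per seed (seed values are an assumption).
splits = [random_split(num_nodes=1000, seed=s) for s in (0, 1, 2)]
```

Each split partitions the node index set, so every node appears in exactly one of the three parts.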
Hardware Specification | No | The paper does not specify any particular hardware used for running the experiments. It only details models, hyperparameters, and datasets.
Software Dependencies | No | The paper provides hyperparameter settings in Appendix A.3 but does not explicitly list software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | "The training and hyperparameter settings for each model are summarized in Appendix A.3 in Tables 7 and 8." Table 7 lists the hyperparameter settings for the MLP and GNN baselines for all datasets; Table 8 lists the hyperparameter settings for DeepWalk for all datasets.