Calibrating and Improving Graph Contrastive Learning
Authors: Kaili Ma, Garry Yang, Han Yang, Yongqiang Chen, James Cheng
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide both theoretical and empirical results to demonstrate the effectiveness of Contrast-Reg in enhancing the generalizability of the Graph Neural Network (GNN) model and improving the performance of graph contrastive algorithms with different similarity definitions and encoder backbones across various downstream tasks. Furthermore, we design experiments to examine the empirical performance of Contrast-Reg... We begin by introducing the experimental settings in Section 6.1. Section 6.2 presents the main results across various downstream tasks. |
| Researcher Affiliation | Academia | Kaili Ma* EMAIL, Department of Computer Science and Engineering, The Chinese University of Hong Kong; Garry Yang* EMAIL, Department of Computer Science and Engineering, The Chinese University of Hong Kong; Han Yang EMAIL, Department of Computer Science and Engineering, The Chinese University of Hong Kong; Yongqiang Chen EMAIL, Department of Computer Science and Engineering, The Chinese University of Hong Kong; James Cheng EMAIL, Department of Computer Science and Engineering, The Chinese University of Hong Kong |
| Pseudocode | Yes | Algorithm 1: Graph Contrastive Learning Framework; Algorithm 2: ML (Parameter: parameters of an (additional) GNN layer g); Algorithm 3: LC (Hyperparameters: R, the curriculum update epochs, and k, the number of candidate positive samples for each seed node); Algorithm 4: GCA (Hyperparameter: two stochastic augmentation function sets T and T) |
| Open Source Code | No | Our codes and datasets will be made available. |
| Open Datasets | Yes | The datasets we employ encompass citation networks, web graphs, co-purchase networks, and social networks. Comprehensive statistics for these datasets can be found in Appendix D. For Cora, Citeseer, Pubmed, ogbn-arxiv, ogbn-products, and Reddit, we adhere to the standard dataset splits and conduct 10 different runs with fixed random seeds ranging from 0 to 9. For Computers, Photo, and Wiki, we randomly divide the train/validation/test sets, allocating 20/30/all remaining nodes per class, in accordance with the recommendations in the previous literature (Shchur et al., 2018). ...Dataset statistics: The detailed dataset statistics are shown in Table 9.<br>Dataset (Node # / Edge # / Feature # / Class #):<br>Cora (Yang et al., 2016): 2,708 / 5,429 / 1,433 / 7<br>Citeseer (Yang et al., 2016): 3,327 / 4,732 / 3,703 / 6<br>Pubmed (Yang et al., 2016): 19,717 / 44,338 / 500 / 3<br>ogbn-arxiv (Hu et al., 2020a): 169,343 / 1,166,243 / 128 / 40<br>Wiki (Yang et al., 2015): 2,405 / 17,981 / 4,973 / 3<br>Computers (Shchur et al., 2018): 13,381 / 245,778 / 767 / 10<br>Photo (Shchur et al., 2018): 7,487 / 119,043 / 745 / 8<br>ogbn-products (Hu et al., 2020a): 2,449,029 / 61,859,140 / 100 / 47<br>Reddit (Hamilton et al., 2017): 232,965 / 114,615,892 / 602 / 41 |
| Dataset Splits | Yes | For Cora, Citeseer, Pubmed, ogbn-arxiv, ogbn-products, and Reddit, we adhere to the standard dataset splits and conduct 10 different runs with fixed random seeds ranging from 0 to 9. For Computers, Photo, and Wiki, we randomly divide the train/validation/test sets, allocating 20/30/all remaining nodes per class, in accordance with the recommendations in the previous literature (Shchur et al., 2018). ...In order to circumvent the data linkage issue in link prediction, we employ an inductive setting for graph representation learning. We randomly extract induced subgraphs (comprising 85% of the edges) from each original graph for training both the representation learning model and the link predictor, while reserving the remaining edges for validation and testing (10% for the test edge set and 5% for the validation edge set). ...For the Reddit dataset, we naturally partition the data by time, pretraining the models using the first 20 days. We generate an induced subgraph based on the pretraining nodes and divide the remaining data into three parts: the first part produces a new subgraph for fine-tuning the pre-trained model and training the classifier, while the second and third parts are designated for validation and testing. For the ogbn-products dataset, we split the data according to node ID, pretraining the models using a subgraph generated by the initial 70% of the nodes. The data splitting scheme for the remaining data mirrors that of the Reddit dataset. |
| Hardware Specification | Yes | The experiments are conducted on Linux servers installed with an Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz, 256GB RAM and 8 NVIDIA 2080Ti GPUs. |
| Software Dependencies | Yes | Our models, as well as the DGI, GMI and GCN baselines, were implemented in PyTorch Geometric (Fey & Lenssen, 2019) version 1.4.3, DGL (Wang et al., 2019) version 0.5.1 with CUDA version 10.2, scikit-learn version 0.23.1 and Python 3.6. |
| Experiment Setup | Yes | For full-batch training, we used a 1-layer GCN as the encoder with prelu activation; for mini-batch training, we used a 3-layer GCN with prelu activation. We conducted a grid search over learning rates (1e-2, 5e-3, 3e-3, 1e-3, 5e-4, 3e-4, 1e-4) and curriculum settings (including learning rate decay and curriculum rounds) on the full-batch version. For mini-batch training, we used 1e-3 or 5e-4 as the learning rate; 10,10,15 or 10,10,25 as the fanouts; and 1024 or 512 as the batch size. The hyperparameter configurations can be found in Appendix D. |
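The per-class split quoted above (20 train / 30 validation / all remaining test nodes per class, with fixed random seeds) can be sketched as follows. Since the paper's code is not released, the function name and NumPy-based implementation are illustrative assumptions, not the authors' actual pipeline.

```python
import numpy as np

def per_class_split(labels, n_train=20, n_val=30, seed=0):
    """Randomly assign node indices per class: n_train to train,
    n_val to validation, and all remaining nodes to test
    (the split scheme described for Computers, Photo, and Wiki)."""
    rng = np.random.default_rng(seed)
    train, val, test = [], [], []
    for c in np.unique(labels):
        # Indices of all nodes with label c, in random order.
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        train.extend(idx[:n_train])
        val.extend(idx[n_train:n_train + n_val])
        test.extend(idx[n_train + n_val:])
    return np.array(train), np.array(val), np.array(test)
```

Repeating this with seeds 0 through 9 would mirror the "10 different runs with fixed random seeds" protocol the excerpt describes.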