Cluster-guided Contrastive Class-imbalanced Graph Classification

Authors: Wei Ju, Zhengyang Mao, Siyu Yi, Yifang Qin, Yiyang Gu, Zhiping Xiao, Jianhao Shen, Ziyue Qiao, Ming Zhang

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on real-world graph benchmark datasets verify the superior performance of our proposed method against competitive baselines. ... Experimental Settings: Datasets. We evaluate the effectiveness of our proposed model by examining it on both synthetic and real-world datasets ... Experimental results: We record the class-imbalanced classification accuracy of our C3GNN and the aforementioned baseline methods ... Ablation study: In this part, we conducted comprehensive ablation studies on all six datasets to demonstrate the effectiveness of the proposed C3GNN. Hyper-parameter Sensitivity: Here we investigate the sensitivity of our proposed C3GNN to hyper-parameters.
Researcher Affiliation | Collaboration | 1 College of Computer Science, Sichuan University, Chengdu, China; 2 School of Computer Science, State Key Laboratory for Multimedia Information Processing, PKU-Anker LLM Lab, Peking University, Beijing, China; 3 College of Mathematics, Sichuan University, Chengdu, China; 4 Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA; 5 Huawei HiSilicon, Shanghai, China; 6 School of Computing and Information Technology, Great Bay University, Dongguan, China. EMAIL, EMAIL, EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode | Yes | Algorithm 1: Optimization Algorithm of C3GNN
Input: Class-imbalanced graph dataset G = {(G_i, y_i)}_{i=1}^N, time interval of updating cluster centers T, control parameter δ, and temperature parameter τ
Output: Balanced classifier
1: Initialize GNN-based encoder parameters.
2: Train the GNN in the first few epochs for warm-up.
3: while not done do
4:   Adaptively update the cluster centers based on the current graph representations every T epochs.
5:   Sample one augmentation from You et al. (2020).
6:   Compute the supervised contrastive loss L_i^intra for intra-subclass by Eq. (6).
7:   Compute the supervised contrastive loss L_i^inter for inter-subclass by Eq. (7).
8:   Update the GNN parameters by gradient descent to minimize L by Eq. (8).
9: end while
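The paper does not release code, so the subclass-level supervised contrastive losses in steps 6-7 can only be sketched. The snippet below implements a generic supervised contrastive loss (in the style of Khosla et al. 2020) over subclass labels in NumPy; the function name, the use of L2-normalized embeddings, and the exact denominator are assumptions, not the authors' Eq. (6).

```python
import numpy as np

def sup_con_loss(z, labels, tau=0.2):
    """Supervised contrastive loss over subclass assignments (sketch).

    z      : (n, d) graph embeddings (L2-normalized inside the function)
    labels : (n,) integer subclass ids (e.g., cluster assignments)
    tau    : temperature, set to 0.2 in the paper
    """
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / tau                      # pairwise cosine similarities / tau
    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)
    losses = []
    for i in range(n):
        pos = (labels == labels[i]) & not_self[i]   # same-subclass positives
        if not pos.any():
            continue                                 # anchor has no positive pair
        denom = np.exp(sim[i][not_self[i]]).sum()    # all non-self pairs
        # average of -log(exp(sim_pos) / denom) over the positive set
        losses.append(-np.mean(np.log(np.exp(sim[i][pos]) / denom)))
    return float(np.mean(losses))
```

As a sanity check, embeddings that cluster tightly within their subclass yield a lower loss than the same embeddings paired with mismatched subclass labels.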
Open Source Code | No | The paper does not contain any explicit statement about releasing the source code for the proposed method (C3GNN), nor does it provide a link to a code repository.
Open Datasets | Yes | Datasets. We evaluate the effectiveness of our proposed model by examining it on both synthetic and real-world datasets from various domains. Specifically, the datasets are categorized into three groups: (a) synthetic: Synthie (Morris et al. 2016), (b) bioinformatics: ENZYMES (Schomburg et al. 2004), and (c) computer vision: MNIST (Dwivedi et al. 2020), Letter-high (Riesen and Bunke 2008), Letter-low (Riesen and Bunke 2008), and COIL-DEL (Riesen and Bunke 2008).
Dataset Splits | Yes | The dataset is split into training, validation, and testing sets with a 6:2:2 proportion for each respective set.
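The 6:2:2 split can be reproduced with a simple index partition. The paper does not state whether the split is stratified by class or which random seed is used, so the sketch below assumes a plain uniform shuffle; the function name and seed are illustrative.

```python
import numpy as np

def split_622(n, seed=0):
    """Partition n graph indices into 6:2:2 train/val/test sets (sketch).

    Assumes a uniform random shuffle; the paper does not specify
    stratification or seeding.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train = int(0.6 * n)
    n_val = int(0.2 * n)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test
```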
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU models, CPU types).
Software Dependencies | No | The paper mentions using "Graph SAGE (Hamilton, Ying, and Leskovec 2017) as the GNN backbone encoder" and the "Adam optimizer" but does not specify any version numbers for these or other software components like Python, PyTorch, or CUDA.
Experiment Setup | Yes | Implementation details. In our experiments, we utilized Graph SAGE (Hamilton, Ying, and Leskovec 2017) as the GNN backbone encoder with a two-layer MLP classifier. The models were optimized using the Adam optimizer with a fixed learning rate of 0.0001 and a batch size of 32. For our C3GNN, we set the temperature parameter τ to 0.2 and set the time interval T of dynamically updating the cluster centers to 10. Moreover, we fine-tuned the cluster size control parameter δ for each dataset individually.
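The reported hyper-parameters can be collected into a single configuration object for reproduction attempts. The field names below are illustrative (the authors' code is not available); the values are those stated in the implementation details, and δ is left unset because it is tuned per dataset.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class C3GNNConfig:
    """Hyper-parameters reported in the paper; field names are assumed."""
    encoder: str = "GraphSAGE"          # GNN backbone encoder
    classifier_mlp_layers: int = 2      # two-layer MLP classifier
    optimizer: str = "Adam"
    lr: float = 1e-4                    # fixed learning rate
    batch_size: int = 32
    tau: float = 0.2                    # contrastive temperature
    cluster_update_interval: int = 10   # epochs between cluster-center updates (T)
    delta: Optional[float] = None       # cluster-size control, tuned per dataset
```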