RIGNN: A Rationale Perspective for Semi-supervised Open-world Graph Classification

Authors: Xiao Luo, Yusheng Zhao, Zhengyang Mao, Yifang Qin, Wei Ju, Ming Zhang, Yizhou Sun

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We verify our proposed methods on four benchmark datasets in various settings and experimental results reveal the effectiveness of our proposed RIGNN compared with state-of-the-art methods." ... "In this section, we conduct various experiments on six datasets to validate the effectiveness of our RIGNN. The experimental results show the superiority of RIGNN in both open-world and open-set graph classification settings." ... "5 Experiments"
Researcher Affiliation | Academia | "1 Department of Computer Science, University of California, Los Angeles; 2 School of Computer Science, Peking University"
Pseudocode | Yes | "A Algorithm. The algorithm of our RIGNN is summarized as below. Algorithm 1: Training Algorithm of RIGNN"
Open Source Code | No | The paper does not provide an explicit statement about releasing source code, nor does it include a link to a code repository.
Open Datasets | Yes | "We utilize four public benchmark graph datasets, i.e., COIL-DEL, Letter-high, MNIST, CIFAR10, REDDIT and COLORS-3 in our experiments. Their statistics are presented in Table 4."
Dataset Splits | Yes | "We create two scenarios indicating different labeling ratios and denote them as Easy (a higher labeling ratio) and Hard (a lower labeling ratio), respectively. In particular, the ratio for Easy/Hard problems is 0.8/0.5, 0.4/0.2, 0.03/0.01, 0.07/0.03, 0.7/0.3 and 0.8/0.3 for COIL-DEL, Letter-High, MNIST, CIFAR10, REDDIT and COLORS-3, respectively (Luo et al., 2023)."
Hardware Specification | Yes | "We implement the proposed RIGNN with PyTorch and train all the models with an NVIDIA RTX GPU."
Software Dependencies | No | The paper mentions PyTorch but does not specify a version number. No other specific software dependencies with version numbers are provided.
Experiment Setup | Yes | "As for hyperparameters, we set k in the graph-of-graph construction process to 2. For the weight λ in the loss function, we set it to 0.1; their detailed analysis can be found in Section C. The dimension of all hidden features is set to 128. As for the network architecture, we use a two-layer GraphSAGE (Hamilton et al., 2017) to construct the relational detector fθ and a three-layer GIN convolution for the feature extractor gθ. Between the convolutional layers, we implement graph pooling with TopKPooling (Gao & Ji, 2019b) by default. For the Jensen-Shannon mutual information estimator Tγ, we concatenate the two inputs and send the feature to a two-layer MLP; a two-layer MLP is also adopted for the classifier hϕ. For model training, we train the model for 100 epochs in total and utilize the entire dataset for estimating the mutual information. In terms of optimization, we employ the gradient reversal layer (Ganin & Lempitsky, 2015) to realize the adversarial training of γ; this avoids alternating optimization and ensures overall training stability. The parameters θ and ϕ are treated as an integrated unit during training, which mitigates discrepancies in the update frequencies among components. Moreover, our training process is divided into two phases: we first warm up the model with labeled data only, ensuring the parameters reach a stable state before complex interactions are introduced, and then train jointly on all available data. We use the Adam (Kingma & Ba, 2015) optimizer with a batch size of 256 and a learning rate of 0.001."
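The gradient reversal layer cited in the setup above (Ganin & Lempitsky, 2015) can be sketched in a few lines of PyTorch. This is a minimal illustration of the general technique, not the authors' released code; the helper name `grad_reverse` and the scaling factor `lam` are our own choices:

```python
import torch


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)  # identity; view_as keeps the op in the autograd graph

    @staticmethod
    def backward(ctx, grad_output):
        # Flip and scale the gradient; `lam` itself gets no gradient.
        return -ctx.lam * grad_output, None


def grad_reverse(x, lam=1.0):
    """Insert between a feature extractor and an adversarial head."""
    return GradReverse.apply(x, lam)
```

Placed between the shared features and the estimator Tγ, this lets a single backward pass update the estimator to maximize its objective while the upstream parameters move in the opposite direction, which is why the quoted setup describes it as avoiding alternating optimization.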