Teacher-Guided Graph Contrastive Learning

Authors: Jay Nandy, Arnab Kumar Mondal, Manohar Kaul, Prathosh AP

TMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type: Experimental
"Our empirical findings validate these claims on both inductive and transductive settings across diverse downstream tasks, including molecular graphs and social networks. Our experiments on benchmark datasets demonstrate that our framework consistently improves the average AUROC scores for molecular property prediction and social network link prediction."
Researcher Affiliation: Industry
"Jay Nandy (Fujitsu Research India Private Limited); Arnab Kumar Mondal (Fujitsu Research India Private Limited); Manohar Kaul (Fujitsu Research India Private Limited); Prathosh AP (Fujitsu Research India Private Limited; Indian Institute of Science Bengaluru)"
Pseudocode: Yes
"Algorithm 1: Proposed TGCL Framework"
Open Source Code: Yes
"Our code is available at https://github.com/jayjaynandy/TGCL."
Open Datasets: Yes
"Datasets. Following the prior works (You et al., 2021; Xu et al., 2021; Kim et al., 2022), we utilize ZINC15 (Sterling & Irwin, 2015) to train the self-supervised representation learning models. Next, we finetune the models on eight different molecular benchmarks from MoleculeNet (Wu et al., 2018). ... We also present results from biological domains, where the datasets are produced by sampled ego networks from the PPI networks (Zitnik et al., 2019). ... For this task, we select three datasets, i.e., COLLAB, IMDB-Binary, and IMDB-Multi, from the TU dataset benchmark (Morris et al., 2020)."
Dataset Splits: Yes
"We divide the datasets based on the constituting molecules' scaffolds (molecular substructures). ... We separate the dataset into four parts: pretraining, training, validation, and test sets in the ratio of 5:1:1:3, as in Kim et al. (2022)."
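The 5:1:1:3 split quoted above can be sketched in plain Python. This is a minimal illustration of the ratio only; the function name is hypothetical, and the paper's actual split first groups molecules by scaffold rather than slicing by index.

```python
def split_5113(items):
    """Partition a list into pretraining/training/validation/test
    sets in the 5:1:1:3 ratio described in the report.
    Illustrative sketch: slices by position, not by scaffold."""
    n = len(items)
    weights = [("pretrain", 5), ("train", 1), ("valid", 1), ("test", 3)]
    total = sum(w for _, w in weights)  # = 10
    splits, start, acc = {}, 0, 0
    for name, w in weights:
        acc += w
        end = round(n * acc / total)  # cumulative boundary
        splits[name] = items[start:end]
        start = end
    return splits

# Example: 100 graphs -> 50 / 10 / 10 / 30
parts = split_5113(list(range(100)))
```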
Hardware Specification: Yes
"For all experiments, we use PyTorch (Paszke et al., 2019) and PyTorch Geometric libraries (Fey & Lenssen, 2019) with a single NVIDIA A30 Tensor Core GPU."
Software Dependencies: No
"For all experiments, we use PyTorch (Paszke et al., 2019) and PyTorch Geometric libraries (Fey & Lenssen, 2019)... We use the official D-SLA codes provided by Kim et al. (2022)."
Experiment Setup: Yes
"For our proposed framework, we use the same network architecture for both the teacher and the student model. In particular, we use Graph Isomorphism Networks (GINs) (Xu et al., 2019), as applied in the previous works (Hu et al., 2020a; Xu et al., 2021; Kim et al., 2022). These networks consist of 5 layers with 300-dimensional embeddings for nodes and edges, along with an average-pooling strategy for obtaining the graph representations. ... For our experiments, we use three perturbations for each input sample. ... For TGCL-GraphCL, we use τ = 10 in Equation 5. For TGCL-DSLA, we set λ1 and λ2 to 1.0 and 0.5, respectively, for the student model. For the LT-soft loss, we set the temperature τ = 10 (Equation 7) and α = 0.95 (Equation 9). For the LT-margin loss, we set β = 5. Both teacher and student models are trained with a batch size of 256 for 25 epochs, using a learning rate of 1e-3 and the Adam optimizer (Kingma & Ba, 2014)."
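For reference, the hyperparameters quoted in the setup above can be collected into a single configuration dictionary. This is a sketch for readability only; the key names are illustrative assumptions and do not come from the official TGCL code.

```python
# Hyperparameters as quoted in the experiment-setup evidence above.
# Key names are illustrative, not taken from the TGCL repository.
TGCL_CONFIG = {
    "encoder": "GIN",          # Graph Isomorphism Network (Xu et al., 2019)
    "num_layers": 5,
    "embedding_dim": 300,      # node and edge embedding size
    "pooling": "mean",         # average pooling for graph representations
    "num_perturbations": 3,    # perturbed views per input sample
    "tau": 10.0,               # temperature (Equations 5 and 7)
    "alpha": 0.95,             # Equation 9
    "beta": 5.0,               # margin for the LT-margin loss
    "lambda1": 1.0,            # student-model loss weights (TGCL-DSLA)
    "lambda2": 0.5,
    "batch_size": 256,
    "epochs": 25,
    "learning_rate": 1e-3,
    "optimizer": "Adam",
}
```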