Gradient Scarcity in Graph Learning with Bilevel Optimization

Authors: Hashem Ghanem, Samuel Vaiter, Nicolas Keriven

TMLR 2024

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | Our empirical results validate our analysis and show that this issue also occurs with the Approximate Personalized Propagation of Neural Predictions (APPNP), which approximates a model of infinite receptive field. We finally present our empirical results in Section 6. [...] 6 Experiments: We use the real-world citation networks Cora (Lu & Getoor, 2003), CiteSeer (Bhattacharya & Getoor, 2007), and PubMed (Namata et al., 2012), and two other synthetic datasets to validate our findings. [...] Table 1: Accuracies obtained on citation networks with the Bilevel Optimization framework (BO), the same framework optimizing a G2G model (BO+G2G), and the same framework equipped with graph regularization (BO+Reg). We also benchmark against GAM (the result is reported from the corresponding paper) and against Aobs. Each experiment is repeated 5 times and the average accuracies are reported. For each dataset, the first line (in black) corresponds to the test accuracy, whereas the second line (in gray) corresponds to the training accuracy on Vout. The highest and second-highest test accuracies for each dataset are bolded. The training accuracy on Vtr is higher than 96% in all experiments.
Researcher Affiliation | Academia | Hashem Ghanem (EMAIL), Institut de Mathématiques de Bourgogne, CNRS, Université de Bourgogne, France; Samuel Vaiter (EMAIL), CNRS & Université Côte d'Azur, Laboratoire J. A. Dieudonné; Nicolas Keriven (EMAIL), CNRS & IRISA (Institut de Recherche en Informatique et Systèmes Aléatoires)
Pseudocode | No | The paper describes methods and models using mathematical formulations and textual descriptions (e.g., Equations 1, 3, 4, 5, and 6), but it does not contain any clearly labeled pseudocode or algorithm blocks with structured steps.
Open Source Code | Yes | Our Python implementation is available at https://github.com/hashemghanem/Gradients_scarcity_graph_learning.
Open Datasets | Yes | We use the real-world citation networks Cora (Lu & Getoor, 2003), CiteSeer (Bhattacharya & Getoor, 2007), and PubMed (Namata et al., 2012), and two other synthetic datasets to validate our findings.
Dataset Splits | Yes | From the default train/validation/test split in Yang et al. (2016) and Kipf & Welling (2017), we use the training set as the inner training set Vtr, while we use half of the validation set as the outer training set Vout. The other half is kept as a validation set as in Franceschi et al. (2019). [...] The first procedure randomly samples 100 nodes from the set V, hence Vtr is well-spread, whereas the second procedure selects the 100 nodes with the smallest Euclidean distance to the point (0.5, 0.5), thus Vtr is concentrated in a small neighborhood in this case. In both cases, we randomly sample 25 nodes from V to construct Vout. The remaining nodes are equally divided between the validation and the test sets. [...] Vtr includes nodes in {0, 1, ..., n/8} ∪ {7n/8, ..., n−1}, i.e., near the two ends of the 1-dimensional class. Vout = {3n/8, ..., 5n/8}, i.e., centered around the middle of the class. Remaining nodes are equally divided into a validation and a test set.
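The two synthetic sampling procedures quoted above (well-spread vs. concentrated Vtr, 25-node Vout, remainder halved into validation/test) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function name `split_synthetic` and the assumption that nodes carry 2-D coordinates in the unit square are ours.

```python
import numpy as np

def split_synthetic(coords, n_tr=100, n_out=25, concentrated=False, seed=0):
    """Split nodes into Vtr / Vout / validation / test, following the
    two sampling procedures described in the paper (a sketch)."""
    rng = np.random.default_rng(seed)
    n = len(coords)
    if concentrated:
        # 100 nodes closest to (0.5, 0.5): Vtr concentrated in a neighborhood
        d = np.linalg.norm(coords - np.array([0.5, 0.5]), axis=1)
        v_tr = np.argsort(d)[:n_tr]
    else:
        # 100 nodes sampled uniformly from V: Vtr well-spread
        v_tr = rng.choice(n, size=n_tr, replace=False)
    rest = np.setdiff1d(np.arange(n), v_tr)
    # 25 nodes for the outer training set Vout
    v_out = rng.choice(rest, size=n_out, replace=False)
    rest = np.setdiff1d(rest, v_out)
    # remaining nodes are equally divided between validation and test
    rng.shuffle(rest)
    half = len(rest) // 2
    return v_tr, v_out, rest[:half], rest[half:]
```

The same helper covers both procedures via the `concentrated` flag, which mirrors the paper's contrast between a well-spread and a localized inner training set.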
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments; it only mentions the software frameworks used.
Software Dependencies | No | Models: G2G and GNN models are implemented using PyTorch (Paszke et al., 2019) and PyTorch Geometric (Fey & Lenssen, 2019), respectively. [...] In this work, we use the Higher package (Grefenstette et al., 2019) to perform the aforementioned automatic differentiation. The paper mentions software packages such as PyTorch, PyTorch Geometric, and Higher, along with citations to their respective papers, but it does not specify exact version numbers (e.g., PyTorch 1.9, Python 3.8), which are required for reproducibility.
Experiment Setup | Yes | Setup: we use Adam (Kingma & Ba, 2014) as the inner and the outer optimizer with the default parameters of PyTorch, except for the inner learning rate ηin and the outer one ηout, which are tuned from the set {10^-4, 10^-3, ..., 10}. The best values were ηin = ηout = 10^-2 for the citation datasets. For the cheaters dataset, ηout = 10^-3 when adopting a G2G model, while ηout = 10^-2 in other cases, and ηin = 10^-2. For the synthetic dataset 1, ηout = 10^-1 and ηin = 10. We set τin with a grid search. For the citation datasets, τin = 500 when adopting the Laplacian regularization, and τin = 100 otherwise. For the cheaters dataset, τin = 200. For the synthetic dataset 1, τin = 500. The learnable parameters of the inner model are reinitialized at random after each outer iteration. We adopt the default initialization of PyTorch and PyTorch Geometric for the parameters of the GNN models. [...] We set the number of outer iterations τout to 150 for the synthetic datasets and to 300 for the citation datasets. [...] All hidden layers are followed by the ELU activation function (Clevert et al., 2015). The G2G output layer is followed by the sigmoid function. The output layer of the GNN models is followed by the softmax function.
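The bilevel setup quoted above (Adam outer optimizer, τin inner steps, inner parameters reinitialized at every outer iteration, hypergradients through the unrolled inner loop) can be sketched as a toy problem. This is a minimal illustration, not the authors' pipeline: the linear "model", the data, and the plain-SGD inner steps are stand-ins (the paper trains GNNs with Adam inner steps, differentiated via the Higher package).

```python
import torch

# Toy bilevel problem: learn an outer parameter `a` so that an inner model,
# trained on Vtr, generalizes to the outer training set Vout.
torch.manual_seed(0)
x = torch.randn(50, 4)
y = torch.randn(50)
v_tr, v_out = torch.arange(0, 25), torch.arange(25, 50)

a = torch.zeros(4, requires_grad=True)       # outer variable (stands in for the graph)
outer_opt = torch.optim.Adam([a], lr=1e-2)   # outer optimizer, as in the paper

tau_in, tau_out, eta_in = 20, 30, 1e-1       # toy counterparts of the paper's settings
for outer_it in range(tau_out):
    # Inner parameters are reinitialized at random / reset at each outer iteration.
    w = torch.zeros(4, requires_grad=True)
    for _ in range(tau_in):
        pred = x @ (w + a)                   # inner model depends on w and on a
        inner_loss = ((pred[v_tr] - y[v_tr]) ** 2).mean()
        g, = torch.autograd.grad(inner_loss, w, create_graph=True)
        w = w - eta_in * g                   # unrolled, differentiable inner step
    outer_loss = ((x @ (w + a))[v_out] - y[v_out]).pow(2).mean()
    outer_opt.zero_grad()
    outer_loss.backward()                    # hypergradient through the unrolled loop
    outer_opt.step()
```

Keeping `create_graph=True` on the inner gradients is what makes the outer `backward()` flow through all τin inner steps; this is the differentiation that the paper delegates to Higher.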