SALE-MLP: Structure Aware Latent Embeddings for GNN to Graph-free MLP Distillation

Authors: Harsh Pal, Sarthak Malik, Rajat Patel, Aakarsh Malhotra

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that SALE-MLP outperforms existing G2M methods across tasks and datasets, achieving a 3-4% improvement in node classification in inductive settings while maintaining strong transductive performance.
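For context on what a GNN-to-MLP (G2M) method distills: a graph-free MLP student is trained to match a graph-based teacher's soft predictions, so inference needs node features only. A minimal NumPy sketch under stated assumptions: the "teacher" is a random linear map standing in for a trained GNN, and the student is a single linear layer, neither of which is the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerically stable softmax
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Toy stand-ins: node features X and a frozen "teacher" producing soft labels.
# In a real G2M pipeline the teacher is a trained GNN; here it is a random
# linear map, purely for illustration.
n, d, c = 200, 16, 4
X = rng.normal(size=(n, d))
teacher_W = rng.normal(size=(d, c))
p_teacher = softmax(X @ teacher_W)          # teacher soft predictions

# Graph-free student (one linear layer for brevity; an MLP in practice).
W = np.zeros((d, c))

def kl_to_teacher(W):
    """Mean KL(teacher || student) over nodes -- the distillation loss."""
    p_s = softmax(X @ W)
    return np.mean(np.sum(p_teacher * (np.log(p_teacher) - np.log(p_s)), axis=1))

lr = 0.5
losses = [kl_to_teacher(W)]
for _ in range(100):
    p_s = softmax(X @ W)
    grad = X.T @ (p_s - p_teacher) / n      # gradient of the KL w.r.t. W
    W -= lr * grad
    losses.append(kl_to_teacher(W))
```

After a few gradient steps the student's KL to the teacher drops sharply; SALE-MLP's contribution is in what else (structure-aware latent embeddings) the student is trained to preserve, not shown in this sketch.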
Researcher Affiliation | Industry | Harsh Pal, Sarthak Malik, Rajat Patel, Aakarsh Malhotra; AI Garage, Mastercard, Gurugram, Haryana, India. (EMAIL, EMAIL)
Pseudocode | Yes | Algorithm 1: Proposed SALE-MLP
Open Source Code | Yes | "For more details on the implementation and additional results (including other unsupervised structural losses), read the supporting material." https://github.com/ganzagun/SALE-MLP
Open Datasets | Yes | The experiments use six widely adopted public benchmark datasets: Cora, Citeseer, and Pubmed [Sen et al., 2008]; Amazon-Photo and Amazon-Computer [Feng et al., 2022]; and the large-scale graph ogbn-arxiv [Hu et al., 2020].
Dataset Splits | Yes | For Cora, Citeseer, and Pubmed, the splits follow [Kipf and Welling, 2022]; for the Amazon datasets, the splits follow [Zhang et al., 2022a], i.e., 20-shot for training, 30-shot for validation, and the remainder for testing. ogbn-arxiv uses the standard splits [Hu et al., 2020].
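The 20-shot/30-shot protocol quoted above can be sketched as a per-class split: 20 nodes per class for training, 30 per class for validation, and the rest for testing. The helper below is illustrative, not the authors' code; the function name and seed handling are assumptions.

```python
import numpy as np

def few_shot_split(labels, n_train=20, n_val=30, seed=0):
    """Per-class split: n_train nodes/class for training, n_val/class for
    validation, and the remainder for testing (mirrors the 20/30-shot
    protocol described for the Amazon datasets)."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train, val, test = [], [], []
    for c in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == c))
        train.extend(idx[:n_train])
        val.extend(idx[n_train:n_train + n_val])
        test.extend(idx[n_train + n_val:])
    return np.array(train), np.array(val), np.array(test)
```

For a dataset with 3 classes of 100 nodes each, this yields 60 training, 90 validation, and 150 test indices, all disjoint.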
Hardware Specification | No | The paper does not specify the hardware (GPU/CPU models or processor types) used for its experiments; it reports inference time without naming the hardware.
Software Dependencies | No | The paper does not list ancillary software with version numbers (e.g., Python, PyTorch, CUDA) needed to replicate the experiments.
Experiment Setup | Yes | For SALE-MLP, the following hyper-parameters are grid-searched on the validation data: # hidden layers {2, 3}; # walks {1, 2, 5}; walk length {3, 5, 10}; pre-train epochs {1, 2, 5, 10}; λ {0.0, 0.1, ..., 1.0}; α {1, 1.5, 2, 2.5, 3, 3.5, 4}; hidden-layer dimensionality {64, 128, 256}.
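The reported search space amounts to an exhaustive sweep over 16,632 configurations. A minimal sketch of that sweep is below; the `evaluate` callable is a hypothetical stand-in for training SALE-MLP with a given configuration and scoring it on the validation split.

```python
from itertools import product

# The search space quoted above, transcribed verbatim from the paper.
grid = {
    "hidden_layers": [2, 3],
    "num_walks": [1, 2, 5],
    "walk_len": [3, 5, 10],
    "pretrain_epochs": [1, 2, 5, 10],
    "lam": [round(0.1 * i, 1) for i in range(11)],   # λ in {0.0, 0.1, ..., 1.0}
    "alpha": [1, 1.5, 2, 2.5, 3, 3.5, 4],            # α
    "hidden_dim": [64, 128, 256],
}

def grid_search(evaluate):
    """Exhaustive search; `evaluate` maps a config dict to a validation
    score (here a placeholder for training and validating the model)."""
    keys = list(grid)
    best_cfg, best_score = None, float("-inf")
    for values in product(*grid.values()):
        cfg = dict(zip(keys, values))
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

An exhaustive sweep is feasible here because the grid is modest; for larger spaces, a random or Bayesian search would be the usual substitute.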