Zero-shot Generalist Graph Anomaly Detection with Unified Neighborhood Prompts

Authors: Chaoxi Niu, Hezhe Qiao, Changlu Chen, Ling Chen, Guansong Pang

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on real-world GAD datasets show that UNPrompt significantly outperforms diverse competing methods under the generalist GAD setting, and it also performs strongly under the one-model-for-one-dataset setting. Code is available at https://github.com/mala-lab/UNPrompt.
Researcher Affiliation | Academia | (1) AAII, University of Technology Sydney, Sydney, Australia; (2) School of Computing and Information Systems, Singapore Management University, Singapore; (3) Faculty of Data Science, City University of Macau, Macau, China
Pseudocode | No | The paper describes the methodology in prose, detailing components and their functions, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | Code is available at https://github.com/mala-lab/UNPrompt.
Open Datasets | Yes | We evaluate UNPrompt on several real-world GAD datasets from diverse social networks, online shopping co-review networks, and co-purchase networks. Specifically, the social networks include Facebook [Xu et al., 2022], Reddit [Kumar et al., 2019] and Weibo [Kumar et al., 2019]. The co-review networks consist of Amazon [McAuley and Leskovec, 2013], YelpChi [Rayana and Akoglu, 2015], Amazon-all and YelpChi-all. Disney [Sánchez et al., 2013] is a co-purchase network.
Dataset Splits | No | The paper states that UNPrompt is trained on Facebook and tested on the other GAD datasets (the zero-shot setting), but it does not specify explicit train/validation/test splits, in percentages or sample counts, for Facebook or for any other dataset.
Hardware Specification | No | The paper does not explicitly mention any specific hardware components such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, such as programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow, CUDA).
Experiment Setup | Yes | For a fair comparison, the common dimensionality is set to eight for all methods, and SVD is used for feature projection. The number of GNN layers is set to one and the number of hidden units is 128. The transformation layer is implemented as a one-layer MLP with the same number of hidden units. The size of the neighborhood prompt is set to one. Results for other hyperparameter settings are presented in the supplementary. For all baselines, their recommended optimization and hyperparameter settings are used.
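The reported setup (SVD projection to a common 8-dimensional feature space, a single GNN layer with 128 hidden units, and a one-layer MLP transformation) can be sketched as follows. This is a minimal NumPy illustration of those dimensionalities, not the UNPrompt implementation; the function names and the mean-aggregation GNN layer are assumptions for illustration.

```python
import numpy as np

def svd_project(X, dim=8):
    # Project raw node features to a common dimensionality via truncated SVD,
    # as described in the setup (dim = 8 for all methods).
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :dim] * S[:dim]

def gnn_layer(A, H, W):
    # One GNN layer (hypothetical mean-aggregation variant): average
    # neighbor features, apply a linear transform, then ReLU.
    deg = A.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0            # guard isolated nodes
    return np.maximum((A @ H) / deg @ W, 0.0)

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 32))      # 10 nodes with 32 raw features
A = (rng.random((10, 10)) < 0.3).astype(float)  # toy adjacency matrix

Z = svd_project(X, dim=8)          # common 8-dim input space
W_gnn = rng.normal(size=(8, 128))  # one GNN layer, 128 hidden units
H = gnn_layer(A, Z, W_gnn)

W_mlp = rng.normal(size=(128, 128))  # one-layer MLP transformation layer
out = H @ W_mlp

print(Z.shape, H.shape, out.shape)
```

The shapes trace the pipeline: raw features are compressed to 8 dimensions before any graph-specific processing, which is what allows a single trained model to be applied across datasets with differently sized feature spaces.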