UniFORM: Towards Unified Framework for Anomaly Detection on Graphs

Authors: Chuancheng Song, Xixun Lin, Hanyang Shen, Yanmin Shang, Yanan Cao

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments on real-world datasets demonstrate that UniFORM significantly outperforms state-of-the-art methods across multiple granularities.
Researcher Affiliation Academia Chuancheng Song1,2, Xixun Lin1,2, Hanyang Shen1,2, Yanmin Shang1,2, Yanan Cao1,2* 1Institute of Information Engineering, Chinese Academy of Sciences 2School of Cyber Security, University of Chinese Academy of Sciences
Pseudocode No The paper does not contain any clearly labeled pseudocode or algorithm blocks. Methodologies are described in paragraph text and mathematical formulations.
Open Source Code No The paper does not explicitly state that source code is provided or offer any links to a code repository.
Open Datasets Yes We conduct experiments using datasets from three distinct domains: Research Networks (Cora, Pubmed, COLLAB), Social Networks (BlogCatalog, Flickr, Enron, IMDB, Reddit), and Commercial Networks (Yelp, IBMAML).
Dataset Splits No The paper mentions using both ground-truth and artificially injected anomalies following (Liu et al. 2021) and evaluates using AUC, but it does not provide specific details on the training/validation/test splits (e.g., percentages or exact counts) for any of the datasets.
Hardware Specification Yes All models were run on Python 3.9.19, NVIDIA Tesla V100 GPU, 629GB RAM, and 2.20GHz Intel Xeon E5-2650 CPU.
Software Dependencies Yes All models were run on Python 3.9.19, NVIDIA Tesla V100 GPU, 629GB RAM, and 2.20GHz Intel Xeon E5-2650 CPU.
Experiment Setup Yes For efficiency and performance, we fixed the sampled community size c (central component plus c−1 hop neighbors for ego-graphs, and random walk steps for subgraph fragments) to 4. For isolated nodes or those in smaller communities, nodes are repeatedly sampled until an overlapping community of the desired size is formed. In Langevin Dynamics, we select ϵ = 0.3 and T = 25, justified below. The energy-based GNN uses 2 layers (K = 2) to extract information from small communities, with an embedding dimension f = 64. Batch size is set to 300 for each dataset. All models are optimized using the Adam optimizer. Training epochs are 200 for Cora, Pubmed, BlogCatalog, and Flickr; 400 for Enron, IMDB, and Reddit; and 600 for COLLAB, Yelp, and IBMAML. Learning rates are 0.001 for Cora, Pubmed, BlogCatalog, and Flickr, and 0.003 for the others.
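The Langevin Dynamics configuration above (ϵ = 0.3, T = 25) can be sketched as follows. This is a minimal stand-in, not the paper's implementation: the quadratic energy E(x) = x²/2 and the function names are placeholders, since the actual energy comes from the paper's energy-based GNN, which is not reproduced here.

```python
import math
import random

def grad_energy(x):
    # Placeholder gradient: dE/dx for the toy energy E(x) = x^2 / 2.
    # In the paper this would be the gradient of the energy-based GNN's output.
    return x

def langevin_sample(x0, epsilon=0.3, T=25, seed=0):
    """Run T Langevin steps with step size epsilon:
    x <- x - (epsilon / 2) * dE/dx + sqrt(epsilon) * z,  z ~ N(0, 1)."""
    rng = random.Random(seed)
    x = x0
    for _ in range(T):
        z = rng.gauss(0.0, 1.0)
        x = x - 0.5 * epsilon * grad_energy(x) + math.sqrt(epsilon) * z
    return x

sample = langevin_sample(5.0)
```

With the toy quadratic energy the chain contracts toward the low-energy region around 0 while the noise term keeps it stochastic, which is the mechanism the reported ϵ and T control.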