UniFORM: Towards Unified Framework for Anomaly Detection on Graphs
Authors: Chuancheng Song, Xixun Lin, Hanyang Shen, Yanmin Shang, Yanan Cao
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on real-world datasets demonstrate that UniFORM significantly outperforms state-of-the-art methods across multiple granularities. |
| Researcher Affiliation | Academia | Chuancheng Song, Xixun Lin, Hanyang Shen, Yanmin Shang, Yanan Cao* — Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. Methodologies are described in paragraph text and mathematical formulations. |
| Open Source Code | No | The paper does not explicitly state that source code is provided or offer any links to a code repository. |
| Open Datasets | Yes | We conduct experiments using datasets from three distinct domains: Research Networks (Cora, Pubmed, COLLAB), Social Networks (BlogCatalog, Flickr, Enron, IMDB, Reddit), and Commercial Networks (Yelp, IBMAML). |
| Dataset Splits | No | The paper mentions using both ground-truth and artificially injected anomalies following (Liu et al. 2021) and evaluates using AUC, but it does not provide specific details on the training/validation/test splits (e.g., percentages or exact counts) for any of the datasets. |
| Hardware Specification | Yes | All models were run on Python 3.9.19, NVIDIA Tesla V100 GPU, 629GB RAM, and 2.20GHz Intel Xeon E5-2650 CPU. |
| Software Dependencies | Yes | All models were run on Python 3.9.19, NVIDIA Tesla V100 GPU, 629GB RAM, and 2.20GHz Intel Xeon E5-2650 CPU. |
| Experiment Setup | Yes | For efficiency and performance, we fixed the sampled community size c (central component plus c − 1 hop neighbors for ego-graphs, and random walk steps for subgraph fragments) to 4. For isolated nodes or those in smaller communities, nodes are repeatedly sampled until an overlapping community of the desired size is formed. In Langevin Dynamics, we select ϵ = 0.3 and T = 25, justified below. The energy-based GNN uses 2 layers (K = 2) to extract information from small communities, with an embedding dimension f = 64. Batch size is set to 300 for each dataset. All models are optimized using the Adam optimizer. Training epochs are 200 for Cora, Pubmed, BlogCatalog, and Flickr; 400 for Enron, IMDB, and Reddit; and 600 for COLLAB, Yelp, and IBMAML. Learning rates are 0.001 for Cora, Pubmed, BlogCatalog, and Flickr, and 0.003 for the others. |
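The Langevin Dynamics settings quoted above (ϵ = 0.3, T = 25) can be sketched as a standard unadjusted Langevin sampler. This is a minimal illustration, not the paper's implementation: the update rule is the common EBM form x_{t+1} = x_t − (ϵ/2)∇E(x_t) + √ϵ·z_t, and the quadratic energy below is a hypothetical stand-in for the paper's (unspecified) energy-based GNN.

```python
import numpy as np

def langevin_sample(grad_energy, x0, eps=0.3, steps=25, rng=None):
    """Run `steps` iterations of unadjusted Langevin dynamics:
    x_{t+1} = x_t - (eps / 2) * grad_E(x_t) + sqrt(eps) * z_t, z_t ~ N(0, I).
    `eps` and `steps` mirror the eps = 0.3, T = 25 reported in the setup.
    """
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        noise = rng.standard_normal(x.shape)
        x = x - 0.5 * eps * grad_energy(x) + np.sqrt(eps) * noise
    return x

# Hypothetical toy energy E(x) = 0.5 * ||x||^2, so grad E(x) = x.
# Starting far from the mode, the chain drifts toward low-energy regions.
final = langevin_sample(lambda x: x, x0=np.full(4, 10.0))
```

With a quadratic energy, each step contracts the sample toward the origin by a factor of (1 − ϵ/2) before noise is added, so after 25 steps the chain sits near the low-energy mode rather than the starting point.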