HyperPLR: Hypergraph Generation through Projection, Learning, and Reconstruction
Authors: Weihuang Wen, Tianshu Yu
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We have evaluated HyperPLR on existing real-world hypergraph datasets, which consistently demonstrate superior performance and validate the effectiveness of our approach. |
| Researcher Affiliation | Academia | Weihuang Wen, Tianshu Yu; School of Data Science, The Chinese University of Hong Kong, Shenzhen; EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: Greedy Weight Coverage; Algorithm 2: Generate Edge Table |
| Open Source Code | Yes | The code of our experiments is publicly available at https://github.com/LOGO-CUHKSZ/HyperPLR. |
| Open Datasets | Yes | Our experimental evaluation utilizes five real-world datasets from Benson et al. (2018): contact-high-school: This dataset is constructed from interactions recorded by wearable sensors at a high school, consisting of 327 nodes, 172,035 timestamped hyperedges, and 7,818 unique hyperedges. contact-primary-school: This dataset is constructed from interactions recorded by wearable sensors at a primary school, consisting of 242 nodes, 106,879 timestamped hyperedges, and 12,704 unique hyperedges. email-Enron: In this dataset, nodes represent email addresses at Enron, and each hyperedge comprises the sender and all recipients of an email. The dataset contains 143 nodes, ... email-Eu: This dataset includes email addresses at a European research institution, with hyperedges representing the sender and all recipients of an email with the same timestamp. The dataset consists of 998 nodes, 234,760 timestamped hyperedges, and 25,027 unique hyperedges. NDC-classes: In this dataset, each hyperedge corresponds to a drug, and the nodes are the class labels assigned to the drugs. The dataset consists of 1,161 nodes, 49,724 timestamped hyperedges, and 1,088 unique hyperedges. |
| Dataset Splits | No | The paper mentions evaluating on datasets and generating data five times, but does not provide specific training/test/validation splits or methodologies for creating them. It states: "Each dataset was generated five times, and the average results were reported." |
| Hardware Specification | Yes | Throughout all the experiments, HyperPLR is running on an Apple M1 CPU. |
| Software Dependencies | No | The paper mentions using ADAM as an optimizer and tools like Node2Vec, GCN, and CELL, but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | The dimension of node embedding from Node2Vec is 50. The GCN consists of two layers, with input/output dimensions 50/128 and 128/128, respectively. We employ ADAM as the optimizer for all learning modules (i.e., Node2Vec, GCN, and CELL). For CELL, we set the parameter edge_overlap_limit = 0.8, which controls the overlapping portion of the generated and the original graphs. |
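
The layer dimensions in the Experiment Setup row (50/128 and 128/128) can be checked with a small forward pass. The sketch below is not the authors' implementation: it assumes the standard symmetric-normalized GCN propagation rule with self-loops, a ReLU after every layer, and Glorot-style random weights, none of which the row above specifies. The toy graph, the random stand-in for Node2Vec embeddings, and all function names are hypothetical.

```python
import math
import random


def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a dense 0/1 adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node has degree >= 1.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Dense matrix product of two list-of-lists matrices."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def init_weights(d_in, d_out, rng):
    """Glorot-style uniform initialization (an assumption, not from the paper)."""
    bound = math.sqrt(6.0 / (d_in + d_out))
    return [[rng.uniform(-bound, bound) for _ in range(d_out)]
            for _ in range(d_in)]


def gcn_forward(adj, features, weights):
    """Apply H <- ReLU(A_norm @ H @ W) once per weight matrix."""
    a_norm = normalized_adjacency(adj)
    h = features
    for w in weights:
        h = matmul(matmul(a_norm, h), w)
        h = [[max(0.0, v) for v in row] for row in h]  # ReLU (assumed)
    return h


# Layer dimensions as reported: 50 -> 128, then 128 -> 128.
LAYER_DIMS = [(50, 128), (128, 128)]

rng = random.Random(0)
weights = [init_weights(d_in, d_out, rng) for d_in, d_out in LAYER_DIMS]

# Hypothetical toy input: a 4-node path graph with random 50-dimensional
# features standing in for Node2Vec embeddings.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
features = [[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(4)]

out = gcn_forward(adj, features, weights)
print(len(out), len(out[0]))  # one 128-dimensional vector per node
```

The pure-Python matrices keep the sketch dependency-free, which matters here because the paper reports no library versions. A real run would more plausibly use a graph learning library with ADAM as the optimizer.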