reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

HyperPLR: Hypergraph Generation through Projection, Learning, and Reconstruction

Authors: Weihuang Wen, Tianshu Yu

ICLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We have evaluated HyperPLR on existing real-world hypergraph datasets, which consistently demonstrate superior performance and validate the effectiveness of our approach.
Researcher Affiliation	Academia	Weihuang Wen, Tianshu Yu School of Data Science The Chinese University of Hong Kong, Shenzhen EMAIL, EMAIL
Pseudocode	Yes	Algorithm 1: Greedy Weight Coverage; Algorithm 2: Generate Edge Table
Open Source Code	Yes	The code of our experiments is publicly available at https://github.com/LOGO-CUHKSZ/HyperPLR.
Open Datasets	Yes	Our experimental evaluation utilizes five real-world datasets from Benson et al. (2018): contact-high-school: This dataset is constructed from interactions recorded by wearable sensors at a high school, consisting of 327 nodes, 172,035 timestamped hyperedges, and 7,818 unique hyperedges. contact-primary-school: This dataset is constructed from interactions recorded by wearable sensors at a primary school, consisting of 242 nodes, 106,879 timestamped hyperedges, and 12,704 unique hyperedges. email-Enron: In this dataset, nodes represent email addresses at Enron, and each hyperedge comprises the sender and all recipients of an email. The dataset contains 143 nodes, email-Eu: This dataset includes email addresses at a European research institution, with hyperedges representing the sender and all recipients of an email with the same timestamp. The dataset consists of 998 nodes, 234,760 timestamped hyperedges, and 25,027 unique hyperedges. NDC-classes: In this dataset, each hyperedge corresponds to a drug, and the nodes are the class labels assigned to the drugs. The dataset consists of 1,161 nodes, 49,724 timestamped hyperedges, and 1,088 unique hyperedges.
Dataset Splits	No	The paper mentions evaluating on datasets and generating data five times, but does not provide specific training/test/validation splits or methodologies for creating them. It states: "Each dataset was generated five times, and the average results were reported."
Hardware Specification	Yes	Throughout all the experiments, HyperPLR is running on an Apple M1 CPU.
Software Dependencies	No	The paper mentions using ADAM as an optimizer and tools like Node2Vec, GCN, and CELL, but does not provide specific version numbers for any software dependencies.
Experiment Setup	Yes	The dimension of node embedding from Node2Vec is 50. The GCN consists of two layers, with input/output dimensions 50/128 and 128/128, respectively. We employ ADAM as the optimizer for all learning modules (i.e., Node2Vec, GCN, and CELL). For CELL, we set parameter edge overlap limit = 0.8 which controls the overlapping portion of the generated and the original graphs.
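
The Experiment Setup row can be read as a small configuration plus a two-layer GCN with dimensions 50/128 and 128/128. The following is a minimal, dependency-free sketch of those shapes only, not the authors' implementation. The names `SetupConfig`, `normalized_adjacency`, `gcn_forward`, and `init_weights` are hypothetical, as are the toy graph, the random weights, the symmetric normalization with self-loops, and the ReLU after each layer; none of these details are stated in the row above.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SetupConfig:
    """Values quoted in the Experiment Setup row; field names are hypothetical."""
    node2vec_dim: int = 50
    gcn_layer_dims: tuple = ((50, 128), (128, 128))
    optimizer: str = "adam"
    cell_edge_overlap_limit: float = 0.8


def normalized_adjacency(edges, n):
    """Return D^{-1/2} (A + I) D^{-1/2} as a dense list of lists (assumed normalization)."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(x, y):
    """Dense product of two list-of-lists matrices."""
    cols = list(zip(*y))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in x]


def gcn_forward(a_hat, features, weights):
    """Apply H <- ReLU(A_hat @ H @ W) once per weight matrix (ReLU placement is assumed)."""
    h = features
    for w in weights:
        h = matmul(matmul(a_hat, h), w)
        h = [[max(0.0, v) for v in row] for row in h]
    return h


def init_weights(cfg, seed=0):
    """Small random weights, one matrix per (input, output) dimension pair."""
    rng = random.Random(seed)
    return [[[rng.uniform(-0.1, 0.1) for _ in range(d_out)] for _ in range(d_in)]
            for d_in, d_out in cfg.gcn_layer_dims]


cfg = SetupConfig()
n = 4
edges = [(0, 1), (1, 2), (2, 3)]  # toy path graph, not taken from the paper
rng = random.Random(1)
# Stand-in for Node2Vec output: one 50-dimensional vector per node.
features = [[rng.gauss(0.0, 1.0) for _ in range(cfg.node2vec_dim)] for _ in range(n)]
a_hat = normalized_adjacency(edges, n)
out = gcn_forward(a_hat, features, init_weights(cfg))
print(len(out), len(out[0]))  # 4 128
```

The sketch only checks that 50-dimensional inputs pass through the two layers to 128-dimensional outputs. Training with ADAM and the CELL edge overlap limit are recorded in the configuration but not exercised here.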