reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Improved Approximations for Hard Graph Problems using Predictions

Authors: Anders Aamand, Justin Y. Chen, Siddharth Gollapudi, Sandeep Silwal, Hao Wu

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To complement our theoretical results, we also experimentally test our algorithm compared to baselines which only run a worst-case approximation algorithm or only follow the predictions for the problem of Maximum Independent Set. 6. Experiments We complement our theoretical results with an empirical evaluation for Maximum Independent Set (MIS, Section 5). The two core component to implement the algorithms in our learning-augmented setting are predictions and an approximation algorithm. In order to test our algorithms, we consider moderately-sized graphs on which we can solve MIS using a commercial integer program solver. We generate edge predictions according to the ε-accurate model based on these ground truth labels for varying ε. Then, we utilize our high/low degree MIS algorithm using the greedy MIS approximation algorithm which iteratively picks the lowest degree node which is still available.
Researcher Affiliation	Academia	1University of Copenhagen 2MIT 3UC Berkeley 4UW-Madison 5University of Waterloo. Correspondence to: Anders Aamand <EMAIL>, Justin Y. Chen <EMAIL>, Siddharth Gollapudi <EMAIL>, Sandeep Silwal <EMAIL>, Hao WU <EMAIL>.
Pseudocode	Yes	Algorithm 1 Learned VC 1: S0 , = 100 log(1/ε)/ε2 ... Algorithm 2 Learned Maximum Independent Set 1: = 3 log(1/ε)/ε2 ... Algorithm 3 Learned-Weighted VC 1: S0 , = 100 log(1/ε)/ε2 ... Algorithm 4 Learned Set Cover 1: Jlearned , = 100 log(1/ε)/ε2 ... Algorithm 5 Learning Based MAXCUT 1: Truncation. Define the matrix Ai,j by ...
Open Source Code	No	The paper does not contain an explicit statement about releasing source code or a link to a code repository for the methodology described. It mentions using the third-party CPLEX solver.
Open Datasets	Yes	Datasets We use three graphs of varying sizes representing real social networks: facebook (Leskovec & Mcauley, 2012), congress (Fink et al., 2023), and twitch (Rozemberczki & Sarkar, 2021).
Dataset Splits	No	For the facebook and twitch graphs, we fix the degree threshold for Algorithm 2 to be 10, and vary ε from 0.10 to 0.35. For the congress graph, we fix the degree threshold, we fix the degree threshold to 15, and vary epsilon from 0.10 to 0.25. At the upper end of these ranges of ε both our algorithm and the predictions-only algorithm come close to approximation ratios of 1. Due to the inherent randomness in making the edge predictions, we repeat each algorithm k = 10 times for each ε and report the mean ratio.
Hardware Specification	Yes	Hardware We perform all experiments on a physical workstation with an AMD Ryzen 5900X CPU and 32 GB of RAM.
Software Dependencies	Yes	We use the CPLEX 22.1 solver (CPLEX, 2025) under an academic license.
Experiment Setup	Yes	Setup For the facebook and twitch graphs, we fix the degree threshold for Algorithm 2 to be 10, and vary ε from 0.10 to 0.35. For the congress graph, we fix the degree threshold, we fix the degree threshold to 15, and vary epsilon from 0.10 to 0.25. At the upper end of these ranges of ε both our algorithm and the predictions-only algorithm come close to approximation ratios of 1. Due to the inherent randomness in making the edge predictions, we repeat each algorithm k = 10 times for each ε and report the mean ratio.