Learn2Aggregate: Supervised Generation of Chvátal-Gomory Cuts Using Graph Neural Networks

Authors: Arnaud Deza, Elias B. Khalil, Zhenan Fan, Zirui Zhou, Yong Zhang

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the ML-based separator against heuristic CG cuts and exact separation on the five families of problems in terms of final integrality gap closure (IGC, i.e., how close the LP relaxation bound is to the optimal integer value) and total cut generation time. We find that: 1. On average, we remove 75% of the constraints, leading to mean IGC improvements of 2%, 23%, and 93% at time reductions of 57%, 64%, and 41% for small, medium, and large instances, respectively. These are substantial improvements that validate our motivating observation 1. above as well as the GNN design described in 3.
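The IGC metric quoted above has a standard definition in the cutting-plane literature: the fraction of the gap between the LP relaxation bound and the integer optimum that is closed by the added cuts. A minimal sketch of that computation, assuming the conventional formula (the paper's exact variant may differ):

```python
def integrality_gap_closure(z_lp: float, z_cut: float, z_ip: float) -> float:
    """Fraction of the LP-to-IP gap closed by the generated cuts.

    z_lp:  bound of the initial LP relaxation
    z_cut: LP bound after adding the cuts
    z_ip:  optimal integer value

    Conventional definition (an assumption here, not quoted from the paper):
        IGC = (z_cut - z_lp) / (z_ip - z_lp)
    """
    gap = z_ip - z_lp
    if gap == 0:
        return 1.0  # the relaxation was already tight
    return (z_cut - z_lp) / gap

# e.g. LP bound 80, bound after cuts 95, integer optimum 100 -> 75% closed
print(integrality_gap_closure(80.0, 95.0, 100.0))  # 0.75
```

An IGC of 1.0 means the cuts closed the gap entirely; 0.0 means the bound did not move.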
Researcher Affiliation | Collaboration | 1. Department of Mechanical & Industrial Engineering, University of Toronto; 2. Huawei Technologies Canada
Pseudocode | No | The paper describes the methodology and GNN architecture but does not include a formal pseudocode block or algorithm.
Open Source Code | Yes | Code: https://github.com/khalil-research/ML4Cuts
Open Datasets | No | We experiment with five families of MILPs used in the research area of ML for MILP. They span a diverse range of combinatorial structures, as can be seen in Table 1. For each family of problems, three instance sizes are considered. For each family and size, a dataset of 200 instances is generated. We use MIPLearn (Xavier et al. 2020) for the generation of PM and SS instances and Ecole (Prouvost et al. 2020) for the generation of FL and SC instances. Details regarding problem formulation and instance generation are also included in Appendix B. To collect training data (on a random subset of the 200 instances), we deploy the full CG-MIP separator in a pure cutting plane method to collect features/labels as well as baseline performance for the full separator.
Dataset Splits | Yes | For each dataset, we use 100, 50, and 50 instances for training, validation, and testing, respectively.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running the experiments. It refers generally to 'modern solvers' and 'computational cost' but not specific hardware.
Software Dependencies | Yes | Models are trained with PyTorch 1.10 (Paszke et al. 2019), and SCIP 8.0.0 (Bestuzheva et al. 2021) is used as the MIP solver.
Experiment Setup | Yes | GNN hyperparameters are tuned using 80 random search trials. After training, the GNN's classification threshold (between zero and one) is tuned to maximize the F1-score on validation data. [...] We restrict the maximum number of cut generation rounds to 100 and terminate early if IGC stagnates over 7 rounds to avoid diminishing returns. Additionally, motivated by the CG-MIP implementation in SCIP, we use a complemented MIR cut generation heuristic based on CG-MIP's solution pool. [...] CG-MIP's execution is stopped if an integer (optimal) solution has been found or if one of the following conditions is met: (1) a 15-second time limit is hit, (2) a feasible solution limit of 5000 is hit, or (3) an incumbent (i.e., improving) solution limit of 1000 is hit. The conservative time limit of 15 seconds is used to return cuts fast if any are found. Otherwise, the time limit is doubled and the solution limit is set to 1, i.e., the solver terminates on finding the first cut. The time limit is doubled until a total time limit of 180 seconds is reached, in which case the cutting plane method terminates. Additionally, to ensure CG-MIP focuses on finding feasible solutions rather than proving optimality, we set SCIP's internal parameter emphasis to feasibility (Appendix D).
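The escalating time-limit scheme described in the quoted setup can be sketched as a small control loop. Here `separate` is a hypothetical stand-in for the CG-MIP separator solve (in the paper this is a SCIP-based MIP), and the 180-second cap is read as a cumulative budget, which is one plausible interpretation of the quoted text:

```python
def run_separator_with_escalation(separate, base_limit=15.0, total_limit=180.0):
    """Escalating time-limit scheme for a CG-MIP-style separator (a sketch).

    `separate(time_limit, first_solution_only)` is a hypothetical callback
    that runs the separator MIP under the given time limit and returns a
    (possibly empty) list of cuts. Per the paper: try a conservative 15 s
    limit first; if no cut is found, double the limit and stop at the first
    cut found, repeating until the 180 s budget is exhausted.
    """
    spent = 0.0
    limit = base_limit
    first_only = False
    while spent + limit <= total_limit:
        cuts = separate(limit, first_only)
        spent += limit
        if cuts:
            return cuts       # cuts found: hand them to the cutting plane loop
        limit *= 2            # no cut found: double the time limit ...
        first_only = True     # ... and terminate on the first cut next time
    return []                 # budget exhausted: the cutting plane method stops
```

For example, a separator that only succeeds on its second invocation would be called with limits 15 s (full search) and then 30 s (first-cut mode) before returning its cuts.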