Towards Foundation Models for Mixed Integer Linear Programming
Authors: Sirui Li, Janardhan Kulkarni, Ishai Menache, Cathy Wu, Beibin Li
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical results show that models trained on the data generated by MILP-Evolve achieve significant improvements on unseen problems, including MIPLIB benchmarks. |
| Researcher Affiliation | Collaboration | Sirui Li¹, Janardhan Kulkarni², Ishai Menache², Cathy Wu¹, Beibin Li²; ¹MIT, ²Microsoft Research |
| Pseudocode | No | The paper provides code snippets and refers to code examples in figures (e.g., Figure 6, Figure 7) and Appendix A.4.1 'MILP CODE SYNTAX', which contains actual Python code. However, it does not present a clearly labeled 'Pseudocode' or 'Algorithm' block with structured steps in pseudocode format. |
| Open Source Code | Yes | Our code and data are publicly available at https://github.com/microsoft/OptiGuide. |
| Open Datasets | Yes | Our code and data are publicly available at https://github.com/microsoft/OptiGuide. ... MIPLIB (Gleixner et al., 2021), a widely recognized MILP benchmark dataset. |
| Dataset Splits | Yes | We partition the set of MILP problem classes generated by MILP-Evolve into roughly a 7:1:2 split for training, validation, and test subsets, which we denote by X_train^Evolve, X_val^Evolve, and X_test^Evolve. ... For Integrality Gap Prediction: we split the first 800 MILP classes into 643 classes for training/validation and 157 classes for testing (8:2 ratio). We generate 100 instances per class, and further split all training/validation instances with an 8:2 ratio into separate training and validation sets. |
| Hardware Specification | Yes | We train, validate, and test all methods on a distributed cluster using nodes equipped with 80 Intel(R) Xeon(R) Silver 4316 CPUs and a single Nvidia A100 GPU. |
| Software Dependencies | Yes | We use OpenAI GPT-4o (Achiam et al., 2023) as the LLM for the MILP class generation. ... the text description inputs are embedded via NV-Embed-v1 (Lee et al., 2024), an open-source text embedding model based on Mistral 7B (Jiang et al., 2023). ... Exact MILP solvers (Bestuzheva et al., 2021; Gurobi Optimization, LLC, 2023) typically employ the Branch-and-Bound (B&B) algorithm... |
| Experiment Setup | Yes | For Integrality Gap Prediction and the MILP encoder for Language-MILP Contrastive Learning, our architecture consists of the following main components: 1) We use a Graph Convolutional Network (GCN)... 2) To capture global dependencies... we use the attention mechanism (Vaswani, 2017)... 3) We include three additional nodes... 4) The attention module outputs an updated embedding... We train with the Adam optimizer with a learning rate of 0.001 and a batch size of 32 for a total of 30,000 gradient steps. All hyperparameters are selected on the validation set and frozen before evaluating on the test set. Tables 6 and 7 provide a list of hyperparameters. |
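The class-level 7:1:2 split quoted above (splitting by problem class rather than by instance, so that test instances come from classes never seen during training) can be sketched as follows; the function name, seed, and 1000-class example are illustrative, not from the paper:

```python
import random

def split_classes(class_ids, ratios=(0.7, 0.1, 0.2), seed=0):
    """Partition MILP problem classes into train/val/test subsets.

    Splitting at the class level ensures that every instance of a
    given problem class lands in exactly one subset, so the test set
    measures generalization to unseen problem classes.
    """
    ids = list(class_ids)
    random.Random(seed).shuffle(ids)  # deterministic shuffle for reproducibility
    n = len(ids)
    n_train = int(ratios[0] * n)
    n_val = int(ratios[1] * n)
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]

train, val, test = split_classes(range(1000))
print(len(train), len(val), len(test))  # 700 100 200
```

The same helper applied to 800 classes with ratios `(0.8, 0.0, 0.2)` would reproduce the 8:2 class split used for Integrality Gap Prediction.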
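The encoder pipeline described in the experiment setup (GCN message passing to aggregate local graph structure, an attention layer for global dependencies, then pooling into a single embedding) might be sketched in plain NumPy as below. All dimensions, weight initializations, the single attention head, and mean pooling are illustrative simplifications, not the paper's exact architecture:

```python
import numpy as np

def gcn_layer(H, A, W):
    """One graph-convolution step: average neighbor features via a
    row-normalized adjacency matrix, then apply a linear map and ReLU."""
    deg = A.sum(axis=1, keepdims=True).clip(min=1)
    return np.maximum((A / deg) @ H @ W, 0.0)

def attention(H, Wq, Wk, Wv):
    """Single-head self-attention over node embeddings, capturing
    dependencies beyond local graph neighborhoods."""
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ V

rng = np.random.default_rng(0)
n_nodes, d = 8, 16
H = rng.normal(size=(n_nodes, d))                          # initial node features
A = (rng.random((n_nodes, n_nodes)) < 0.3).astype(float)   # random adjacency
np.fill_diagonal(A, 1.0)                                   # add self-loops

H = gcn_layer(H, A, rng.normal(size=(d, d)) * 0.1)
H = attention(H, *(rng.normal(size=(d, d)) * 0.1 for _ in range(3)))
graph_embedding = H.mean(axis=0)  # pooled embedding fed to a prediction head
print(graph_embedding.shape)  # (16,)
```

In the paper's setting the graph would be the bipartite variable-constraint representation of a MILP instance, with the three extra global nodes mentioned in the quote added before the attention step.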
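For the Language-MILP Contrastive Learning mentioned above, a standard choice of objective is a symmetric InfoNCE-style loss pairing each MILP embedding with its text-description embedding; the paper does not spell out its exact loss in the quotes here, so the form below (CLIP-style, with a fixed temperature) is an assumption for illustration:

```python
import numpy as np

def info_nce(milp_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss: matching MILP/text pairs sit on the
    diagonal of the cosine-similarity matrix and are pushed above
    mismatched pairs in both directions. (Assumed loss form.)"""
    m = milp_emb / np.linalg.norm(milp_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = m @ t.T / temperature
    n = logits.shape[0]
    diag = np.arange(n)
    # cross-entropy with the diagonal as the target, in both directions
    ls_m2t = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    ls_t2m = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    return (-ls_m2t[diag, diag].mean() - ls_t2m[diag, diag].mean()) / 2

rng = np.random.default_rng(1)
loss = info_nce(rng.normal(size=(4, 32)), rng.normal(size=(4, 32)))
print(float(loss) > 0)  # True
```

In training, `milp_emb` would come from the GCN-plus-attention encoder and `text_emb` from the frozen or fine-tuned NV-Embed-v1 model, with a projection to a shared dimension.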