GOAL: A Generalist Combinatorial Optimization Agent Learner

Authors: Darko Drakulić, Sofia Michel, Jean-Marc Andreoli

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We train GOAL on a set of routing, scheduling and classic graph problems and show that it is only slightly inferior to the specialized baselines while being the first multi-task model that solves a wide range of COPs. Finally we showcase the strong transfer learning capacity of GOAL by fine-tuning it on several new problems. Our code is available at this url. [...] 4 EXPERIMENTS Our experiments aim to address the following questions: (i) does the GOAL approach enable learning a single backbone to effectively solve a variety of CO problems (Sec 4.1)? (ii) can it be used as a pretrained model and efficiently fine-tuned to solve new CO problems, unseen at training (Sec 4.2)? (iii) to what extent do our mixed-attention blocks and multi-type architecture contribute to the results (Sec 4.3)?
Researcher Affiliation | Industry | Darko Drakulić, Sofia Michel & Jean-Marc Andreoli, NAVER LABS Europe, France
Pseudocode | No | The paper describes the architecture and methods in prose, with mathematical equations for attention (e.g., equations 1a, 1b, and 2). However, it does not include a dedicated section or figure explicitly labeled "Pseudocode" or "Algorithm", nor are there any structured, code-like blocks describing a procedure or method.
Open Source Code | Yes | Our code is available at this url. [...] REPRODUCIBILITY STATEMENT For full reproducibility, the implementation of the backbone, the adapters and all the experiments described in this paper are publicly available at https://github.com/naver/goal-co.
Open Datasets | No | For each problem, we generate a training dataset of 1 million random instances and solve them using traditional solvers. [...] The description of the problems and the associated data generation processes is provided in Appendix F. [...] Data generation. Node locations are randomly sampled from the unit square, while node demands are randomly generated integers from the interval [1, 10].
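The quoted generation procedure (node locations uniform in the unit square, integer demands in [1, 10]) can be sketched in a few lines; the function name and instance size below are illustrative and not taken from the paper's released code.

```python
import numpy as np

def generate_instance(n_nodes: int, rng: np.random.Generator):
    """Sketch of the described data generation, assuming independent
    uniform sampling per node (our reading of the quoted excerpt)."""
    locations = rng.random((n_nodes, 2))         # uniform in the unit square
    demands = rng.integers(1, 11, size=n_nodes)  # integers in [1, 10]
    return locations, demands
```

A seeded `np.random.Generator` would make such instance generation reproducible across runs, which matters when training sets of 1 million instances are regenerated.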
Dataset Splits | Yes | Training tasks. We train GOAL on eight classic and varied CO problems. For each problem, we generate a training dataset of 1 million random instances and solve them using traditional solvers. [...] The best model was selected based on the average performance on a validation set of 128 instances per problem. [...] In Table 1 we compare the performance of GOAL to the relevant neural baselines for each problem as well as its single-task version (i.e. GOAL trained separately for each task). First, we observe that the best performance is achieved by the single-task GOAL in 7 out of the 8 tasks, with a computation time comparable to that of the baselines.
Hardware Specification | Yes | We trained the model on 8 Nvidia V100 GPU servers, with batch size of 256 for 7 days, completing ca. 400 epochs.
Software Dependencies | No | The paper mentions using the "AdamW optimizer" and references traditional solvers like "LKH (Helsgaun, 2017)", "HGS (Vidal, 2022)", "OR-Tools (Perron & Furnon, 2022)", "FastWVC (Cai, 2023)", and "HiGHS (Huangfu & Hall, 2018)" for generating expert solutions. However, it does not provide specific version numbers for the software libraries or frameworks used in the implementation of GOAL itself.
Experiment Setup | Yes | We use the AdamW optimizer (Loshchilov & Hutter, 2018), with an initial learning rate of 0.0005 and a decay rate of 0.97 every 10th epoch. We trained the model on 8 Nvidia V100 GPU servers, with batch size of 256 for 7 days, completing ca. 400 epochs. The best model was selected based on the average performance on a validation set of 128 instances per problem. The model hyperparameters are described in Appendix A. [...] The GOAL backbone consists of L=9 layers with an embedding size of D=128 (the same for both nodes and edges), a feed-forward dimension of 512, and ReZero normalization (Bachlechner et al., 2021). The mixed-attention blocks have 8 heads. The task-specific input adapters produce representations of low dimension ℓ=8 for nodes and ℓ=4 for edges, which are mapped to embeddings by a shared codebook of dimension ℓ×D for nodes and ℓ×D for edges, respectively.
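The quoted schedule (initial learning rate 0.0005, multiplied by 0.97 every 10th epoch) can be written as a small helper for anyone reimplementing the setup; treating the decay as a step function of `epoch // 10` is our assumption about the exact step boundary, which the paper does not spell out.

```python
def learning_rate(epoch: int, base_lr: float = 5e-4, decay: float = 0.97) -> float:
    """Stepwise learning-rate schedule as described in the paper:
    start at 5e-4 and multiply by 0.97 every 10th epoch (assumed
    to mean a decay step at each multiple of 10 epochs)."""
    return base_lr * decay ** (epoch // 10)
```

Over the reported ca. 400 epochs this decays the learning rate smoothly by roughly a factor of three, consistent with a conventional step schedule such as PyTorch's `StepLR(step_size=10, gamma=0.97)`.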