Scaling Combinatorial Optimization Neural Improvement Heuristics with Online Search and Adaptation

Authors: Federico Julian Camerota Verdù, Lorenzo Castelli, Luca Bortolussi

AAAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we report experimental results on the search capabilities of LRBS and its effect on the generalization of pre-trained DRL agents to large TSP instances and two pickup and delivery variants. We use checkpoints of models from de O. da Costa et al. (2020), pre-trained on Euclidean TSP instances with 100 nodes, and from (Ma et al. 2022), pre-trained on PDTSP and PDTSPL instances with 100 nodes.
Researcher Affiliation Academia 1Dipartimento di Matematica, Informatica e Geoscienze, Università degli Studi di Trieste, Italy 2Dipartimento di Ingegneria e Architettura, Università degli Studi di Trieste, Italy EMAIL, EMAIL, EMAIL
Pseudocode Yes Algorithm 1: Limited Rollout Beam Search
Open Source Code Yes Code https://github.com/federico-camerota/LRBS
Open Datasets No The TSP instances in our experiments are generated as in Kool, van Hoof, and Welling (2019) where the coordinates of nodes are sampled uniformly at random in the unit square. We consider problems with N = {100, 150, 200, 500, 1000} nodes. To ensure a fair comparison with the pre-trained policies, for N = 100 we use the same 10,000 test instances of de O. da Costa et al. (2020). For the other problems, we generate datasets with 1,000 random instances for N = {125, 200} and with 128 instances for N = 500, 1000. For PDTSP and PDTSPL experiments we generate sets of 128 random instances with 200 and 500 nodes.
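The instance-generation procedure quoted above (node coordinates sampled uniformly at random in the unit square, following Kool, van Hoof, and Welling 2019) can be sketched as follows. The function name, the `seed` parameter, and the NumPy array-shape convention are our own illustrative choices, not the paper's code.

```python
import numpy as np

def generate_tsp_instances(num_instances, num_nodes, seed=0):
    """Sample Euclidean TSP instances with node coordinates drawn
    uniformly at random in the unit square.

    Returns an array of shape (num_instances, num_nodes, 2)."""
    rng = np.random.default_rng(seed)
    return rng.uniform(0.0, 1.0, size=(num_instances, num_nodes, 2))

# For example, an evaluation set of 128 instances with N = 500 nodes:
instances = generate_tsp_instances(128, 500)
```

Fixing the seed makes the evaluation sets reproducible across runs, which is what allows the paper to reuse the same test instances when comparing against pre-trained policies.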
Dataset Splits Yes For the other problems, we generate datasets with 1,000 random instances for N = {125, 200} and with 128 instances for N = 500, 1000. For PDTSP and PDTSPL experiments we generate sets of 128 random instances with 200 and 500 nodes. In the second part of Table 2, we show results on the generalization of LRBS after fine-tuning the DRL policy on a small set of randomly generated instances with the same number of nodes as the test set (FT) and when adapting the policy parameters online (OA). The LRBS configurations are the same used for the non-adaptive experiments, with the only exception of LRBS + FT on TSP1000, where we use (β = 10, α = 6). For all the considered problems, the FT dataset of randomly generated instances is of size equal to 10% of the test set size; each instance is solved only once by the LRBS algorithm, and after each policy rollout the new parameters ϕ are updated according to the gradient in Equation 1.
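The online-adaptation step quoted above updates the policy parameters ϕ after each rollout using the gradient in Equation 1. That equation is not reproduced in this table, so the sketch below substitutes a generic REINFORCE-style update for a small softmax policy; the function name, the two-action feature setup, and the reward-to-go return are illustrative assumptions, not the paper's exact update rule.

```python
import numpy as np

def reinforce_update(theta, states, actions, rewards, lr=0.1):
    """Generic REINFORCE-style update for a softmax policy (a stand-in
    for the paper's Equation 1, whose exact form is not quoted here).

    theta:   (num_features, 2) weight matrix for two candidate moves
    states:  (T, num_features) per-step feature vectors of one rollout
    actions: length-T list of chosen move indices
    rewards: (T,) per-step rewards"""
    logits = states @ theta                                  # (T, 2)
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    returns = np.cumsum(rewards[::-1])[::-1]                 # reward-to-go
    grad = np.zeros_like(theta)
    for t, a in enumerate(actions):
        # Gradient of log pi(a|s) for a softmax policy:
        # (one_hot(a) - probs) outer-multiplied with the state features.
        grad += np.outer(states[t], np.eye(2)[a] - probs[t]) * returns[t]
    return theta + lr * grad                                 # gradient ascent
```

Applying one such update after every policy rollout, while solving each instance only once, mirrors the online-adaptation (OA) loop described in the evidence above.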
Hardware Specification Yes We run all our experiments using a single NVIDIA Ampere GPU with 64GB of HBM2 memory.
Software Dependencies No The paper does not explicitly mention any specific software dependencies with version numbers.
Experiment Setup Yes In all our experiments on TSP, for LRBS, we set a search budget such that αβ = 60 and fix the other parameters to ns = 20 and Tmax = 5000, similarly to previous works and balancing solution performance and runtime. On PDTSP and PDTSPL we reduce the budget to 40, and when doing adaptation we use ns = 10 to lower memory consumption. The best values of α and β for each dataset were determined by testing the method on a set of 10 randomly generated instances of the same size as those in the test set.
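To make the roles of the budget parameters concrete (β as a beam width, α as a number of rollout rounds, ns as the samples drawn per candidate), here is a minimal beam-search-with-limited-rollouts sketch. This is a generic stand-in, not the paper's Algorithm 1: `propose_move` and `evaluate` are placeholder callables, and the toy usage minimizes a 1-D objective rather than a tour length.

```python
import random

def lrbs_sketch(init_solution, propose_move, evaluate,
                beam_width, rollout_depth, num_samples, seed=0):
    """Generic limited-rollout beam search: keep the `beam_width` best
    candidates, expand each with `num_samples` sampled moves, repeat
    for `rollout_depth` rounds, and track the best solution seen."""
    rng = random.Random(seed)
    beam = [init_solution]
    best = init_solution
    for _ in range(rollout_depth):
        # Expand every beam entry with num_samples sampled moves.
        candidates = [propose_move(sol, rng)
                      for sol in beam for _ in range(num_samples)]
        # Keep only the beam_width best candidates.
        beam = sorted(candidates, key=evaluate)[:beam_width]
        if evaluate(beam[0]) < evaluate(best):
            best = beam[0]
    return best

# Toy usage: walk toward x = 7 with unit steps (beta=4, alpha=10, ns=5).
best = lrbs_sketch(0.0, lambda x, r: x + r.choice([-1.0, 1.0]),
                   lambda x: abs(x - 7.0),
                   beam_width=4, rollout_depth=10, num_samples=5)
```

Under this reading, a fixed budget αβ = 60 trades beam width against rollout depth: for example (β = 10, α = 6), the configuration the paper reports for LRBS + FT on TSP1000, spends the budget on a wider beam with fewer rounds.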