Efficient Training of Multi-task Neural Solver for Combinatorial Optimization
Authors: Chenguang Wang, Zhang-Hua Fu, Pinyan Lu, Tianshu Yu
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments are conducted for 12 tasks: four types of COPs, namely the Travelling Salesman Problem (TSP), the Capacitated Vehicle Routing Problem (CVRP), the Orienteering Problem (OP), and the Knapsack Problem (KP), each at three problem scales. The approach is compared with single-task training (STL), extensive MTL baselines (Mao et al., 2021; Yu et al., 2020; Navon et al., 2022; Kendall et al., 2018; Liu et al., 2021a;b), and the SOTA task grouping method TAG (Fifty et al., 2021), under both the same training budget and the same number of training epochs. |
| Researcher Affiliation | Academia | Chenguang Wang (School of Data Science, The Chinese University of Hong Kong, Shenzhen; Shenzhen Research Institute of Big Data); Zhang-Hua Fu (School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen; Shenzhen Institute of Artificial Intelligence and Robotics for Society); Pinyan Lu (Shanghai University of Finance and Economics); Tianshu Yu (School of Data Science, The Chinese University of Hong Kong, Shenzhen) |
| Pseudocode | No | The paper describes algorithmic concepts and a pipeline (Figure 1) but does not contain a formally labeled 'Pseudocode' or 'Algorithm' block with structured, step-by-step instructions. |
| Open Source Code | Yes | Our code is open-sourced and available at https://github.com/LOGO-CUHKSZ/MTL-COP. |
| Open Datasets | Yes | On the real-world datasets TSPLib and CVRPLib, our method also achieves the best results compared to single-task learning and multi-task learning approaches. Additionally, the influence matrix provides empirical evidence supporting common practices in the field of learning to optimize, further substantiating the effectiveness of our approach. |
| Dataset Splits | No | The paper states that experiments are averaged over "10,000 instances for each task" and describes training for "1000 epochs," but it does not specify explicit training, validation, or test dataset splits (e.g., percentages or counts) for these instances. |
| Hardware Specification | Yes | The model is trained using 8 Nvidia Tesla A100 GPUs in parallel, and the evaluations are done on a single NVIDIA GeForce RTX 3090. |
| Software Dependencies | No | The paper mentions using Adam (Kingma & Ba, 2015) and an open-source bandit algorithm repository (Besson, 2018), but it does not provide specific version numbers for these or other major software dependencies like Python or deep learning frameworks, which are necessary for reproducible setup. |
| Experiment Setup | Yes | In each epoch, we process a total of 100 * 1000 instances with a batch size of 512. The POMO size is equal to the problem scale, except for KP-200, where it is 100. We optimize the model using Adam (Kingma & Ba, 2015) with a learning rate of 1e-4 and weight decay of 1e-6. The model is trained for 1000 epochs in the standard setting, with the learning rate multiplied by 1e-1 at the 900th epoch. The bandit algorithm explores during the first epoch; thereafter we update it with gradient information collected every 12 batches. |
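The reported training schedule can be sketched in a few lines. This is a minimal illustration only, assuming a step learning-rate schedule; the helper names (`learning_rate`, `batches_per_epoch`) are hypothetical and not taken from the paper's released code, which remains the authoritative reference.

```python
def learning_rate(epoch, base_lr=1e-4, decay_epoch=900, decay_factor=0.1):
    """Step schedule matching the reported setup: a constant base
    learning rate of 1e-4, multiplied by 0.1 from the 900th epoch on."""
    return base_lr * decay_factor if epoch >= decay_epoch else base_lr

# Reported setup: 1000 epochs, 100 * 1000 instances per epoch, batch size 512.
EPOCHS = 1000
INSTANCES_PER_EPOCH = 100 * 1000
BATCH_SIZE = 512
BANDIT_UPDATE_EVERY = 12  # bandit updated with every 12 batches of data

# Ceiling division: number of batches needed to cover one epoch.
batches_per_epoch = -(-INSTANCES_PER_EPOCH // BATCH_SIZE)
```

Under this reading, an epoch spans 196 batches, so the bandit would be updated roughly 16 times per epoch after its initial exploration phase.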