A Mixed-Curvature based Pre-training Paradigm for Multi-Task Vehicle Routing Solver
Authors: Suyu Liu, Zhiguang Cao, Shanshan Feng, Yew-Soon Ong
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present our experimental findings to demonstrate the effectiveness of the proposed mixed-curvature pre-training paradigm in enabling a multi-task solver for vehicle routing problems (VRPs). Through extensive experiments, we evaluate the proposed pre-training strategy on existing neural multi-task solvers across a variety of testing scenarios. The results demonstrate that the curvature-aware pre-training approach not only enhances the generalization capabilities of existing neural VRP solvers on synthetic datasets but also improves solution quality on real-world benchmarks. |
| Researcher Affiliation | Academia | 1College of Computing and Data Science, Nanyang Technological University, Singapore 2School of Computing and Information Systems, Singapore Management University, Singapore 3Centre for Frontier AI Research, The Agency for Science, Technology and Research, Singapore 4Centre for Frontier AI Research, Institute of High Performance Computing, Agency for Science, Technology and Research, Singapore. Correspondence to: Zhiguang Cao <EMAIL>. |
| Pseudocode | No | The paper describes the methodology using prose and mathematical equations (e.g., Eq. (10), (11), (12), (13), (14)) but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at: https://github.com/lsyysl9711/Mixed_Curvature_VRPs |
| Open Datasets | Yes | Real-world evaluations use set-X (Uchoa et al., 2017) for CVRP and set-Solomon (Solomon, 1987) for VRPTW. For real-world testing, the authors follow (Berto et al., 2024) and use sets A, B, E, F, M, P, and X from CVRPLib (Uchoa et al., 2017) to assess model performance under more practical conditions. |
| Dataset Splits | Yes | For each VRP task, 1,000 unseen instances are pre-collected, and gaps are reported relative to the optimal (or best-known) solutions. The in-distribution test comprises 6 VRP tasks included during training, while the zero-shot test includes 10 tasks not seen during training. Instances per training epoch: 20,000; instances per fine-tuning epoch: 10,000; problem scales: {50, 100}. |
| Hardware Specification | Yes | All experiments are conducted on a machine equipped with four NVIDIA RTX A6000 GPUs, each with 48 GB of memory. |
| Software Dependencies | No | We adopt Adam as our optimizer. Table 12 lists "Optimizer Adam". No specific versions for key software components or libraries like Python, PyTorch, or CUDA are provided. |
| Experiment Setup | Yes | The learning rate, weight decay, and batch size are set to 1e-4, 1e-6, and 128, respectively. Each model is trained for 5,000 epochs with 20,000 instances per epoch. During the last 500 epochs, the learning rate is decayed by a factor of 10. Table 12 also lists: "Training Epochs 5,000", "Training Learning Rate 1e-4", "Weight Decay 1e-6", "Training Batch Size 128", "Embedding Size 128", "Hidden Feature Size 512", "Number of Encoder Layers 6", "QKV Dimension 16", "Attention Head Number 8", "Logit Clipping 10". |
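The training schedule reported in the Experiment Setup row can be sketched as a small helper. This is a minimal illustration, not the authors' code: it assumes the "decay the learning rate by 10" step means a single division by 10 applied uniformly over the final 500 of the 5,000 epochs, which is one plausible reading of the paper's description.

```python
def learning_rate(epoch: int,
                  total_epochs: int = 5000,
                  base_lr: float = 1e-4,
                  decay_factor: float = 10.0,
                  decay_window: int = 500) -> float:
    """Per-epoch learning rate under the reported schedule.

    Constant base_lr (1e-4) for most of training, divided by
    decay_factor (10) during the final decay_window (500) epochs.
    Defaults mirror the values quoted from the paper's Table 12;
    the uniform-window interpretation is an assumption.
    """
    if epoch >= total_epochs - decay_window:
        return base_lr / decay_factor
    return base_lr


if __name__ == "__main__":
    # Epoch 0 and epoch 4,499 use the base rate; epoch 4,500 onward is decayed.
    print(learning_rate(0))      # 1e-4
    print(learning_rate(4499))   # 1e-4
    print(learning_rate(4500))   # 1e-5
```

In a PyTorch setup this would correspond to constructing `Adam` with `lr=1e-4, weight_decay=1e-6` and dropping the rate once at epoch 4,500.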