Boosting Neural Combinatorial Optimization for Large-Scale Vehicle Routing Problems
Authors: Fu Luo, Xi Lin, Yaoxin Wu, Zhenkun Wang, Tong Xialiang, Mingxuan Yuan, Qingfu Zhang
ICLR 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct comprehensive experiments on both synthetic and real-world TSP and CVRP benchmarks. The results demonstrate that our NCO method achieves state-of-the-art performance on large-scale VRPs with up to 100K nodes. Our ablation study reveals the effect of the cross-attention and SIT algorithm in improving computational efficiency and solving performance for large-scale VRPs. |
| Researcher Affiliation | Collaboration | Fu Luo1, Xi Lin2, Yaoxin Wu3, Zhenkun Wang1 , Tong Xialiang4, Mingxuan Yuan4, Qingfu Zhang2 1Southern University of Science and Technology 2City University of Hong Kong 3Eindhoven University of Technology 4Huawei Noah s Ark Lab |
| Pseudocode | Yes | D PSEUDOCODE This section provides the pseudocode-based explanation of SIT in Section 4.3 to enhance clarity and understanding. Algorithm 1 is the pseudocode-based description of SIT. Algorithm 2 is for generating the initial pseudo-labels. Algorithm 3 is for the parallel local reconstruction for a single instance. Algorithm 4 is for the model training process. |
| Open Source Code | Yes | The code is available at https://github.com/CIAM-Group/SIL. |
| Open Datasets | Yes | To evaluate our method on real-world large-scale instances, we also extract all symmetric instances with EUC_2D features and more than 1K nodes from TSPLIB Reinelt (1991) and CVRPLIB Uchoa et al. (2017), a total of 33 TSP instances and 14 CVRP instances. |
| Dataset Splits | Yes | According to Fu et al. (2021), we set the number of instances for the TSP1K test dataset to 128. For datasets with larger instances, each contains 16 instances. Similarly, the CVRP test dataset includes the same number of instances, with capacities set to 250 for CVRP1K, 500 for CVRP5K, 1, 000 for CVRP10K, and 2,000 for CVRP50K/100K. ... The size D of the training dataset for scales 1K, 5K/10K, 50K/100K are 20K, 200, and 100, respectively. |
| Hardware Specification | Yes | In all experiments, we use a single NVIDIA Ge Force RTX 3090 GPU with 24GB memory for both training and testing. |
| Software Dependencies | No | The paper mentions using 'Adam optimizer' for training and references 'LKH3' and 'HGS' as solvers, but does not specify version numbers for these or other software libraries/frameworks (e.g., Python, PyTorch, CUDA) crucial for reproducibility. |
| Experiment Setup | Yes | For our Transformer network, we set the embedding dimension d = 128. The decoder employs L = 6 stacked cross-attention modules, with each attention layer including 8 attention heads and a feed-forward layer with a hidden dimension of 512. ... Each iteration in our SIT algorithm comprises 100 times of local reconstruction and 20 epochs of model training. The Adam optimizer (Kingma & Ba, 2015) is utilized for training the models, with an initial learning rate 1e-4 and a decay rate 0.97 per epoch. Throughout the SIT process, the maximum length of partial solutions lmax is 1,000 to balance efficiency and effectiveness. |