NS4S: Neighborhood Search for Scheduling Problems Via Large Language Models

Authors: Junjie Zhang, Canhui Luo, Zhouxing Su, Qingyun Zhang, Zhipeng Lü, Junwen Ding, Yan Jin

IJCAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental
LLM Response: Extensive experiments are conducted on 349 benchmark instances across three classical scheduling problems. The results demonstrate that our algorithm significantly outperforms existing state-of-the-art methods. For JSP, our algorithm reduces the average optimality gap from 10.46% to 1.35% on Taillard's instances compared to reinforced adaptive staircase curriculum learning. For flexible JSP (FJSP), it reduces the gap from 13.24% to 0.05% on Brandimarte's instances compared to deep reinforcement learning methods. Furthermore, for FJSP with sequence-dependent setup time, our algorithm updates 9 upper bounds for benchmark instances.
Researcher Affiliation | Academia
LLM Response: School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China. EMAIL, EMAIL
Pseudocode | Yes
LLM Response:
Algorithm 1 The main framework of the NS4S algorithm
Input: JSP instance
Output: the best solution found so far S*
 1: S ← Init(), S* ← S
 2: f* ← S.makespan
 3: while stopping condition is not met do
 4:     move(o*, k*) ← LLMGuided_Neighbor_Eval(S)   /* Section 3.2 */
 5:     S ← S ⊕ move(o*, k*)
 6:     f ← S.makespan
 7:     Weight_Update_By_LLMs()   /* Section 3.3 */
 8:     if f* > f then
 9:         S* ← S, f* ← f
10:     end if
11: end while
12: Return S*
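The control flow of Algorithm 1 can be sketched in Python as follows. This is a minimal sketch only: `init_solution`, `apply_move`, and both LLM-guided components are placeholder stubs invented for illustration, not the paper's implementation (which couples an LLM-evolved move evaluator with a C++ neighborhood search).

```python
import random

def init_solution():
    # Placeholder for Init(): the real version builds an initial JSP schedule.
    # Here a "solution" is just a record carrying a makespan value.
    return {"makespan": 100.0}

def llm_guided_neighbor_eval(s):
    # Stand-in for LLMGuided_Neighbor_Eval (Section 3.2): the real version
    # uses an LLM-evolved heuristic to pick the most promising move.
    return ("swap", random.randint(0, 9))

def apply_move(s, move):
    # Stand-in for S <- S (+) move(o*, k*): applying a move perturbs the makespan.
    delta = random.uniform(-5.0, 3.0)
    return {"makespan": max(1.0, s["makespan"] + delta)}

def weight_update_by_llms():
    # Stand-in for Weight_Update_By_LLMs (Section 3.3): the real version lets
    # an LLM adjust move-type weights; a no-op in this sketch.
    pass

def ns4s(max_iters=200, seed=0):
    """Minimal sketch of Algorithm 1: track the best solution seen so far."""
    random.seed(seed)
    s = init_solution()
    best, best_f = s, s["makespan"]
    for _ in range(max_iters):          # stand-in stopping condition
        move = llm_guided_neighbor_eval(s)
        s = apply_move(s, move)
        f = s["makespan"]
        weight_update_by_llms()
        if best_f > f:                  # minimization: smaller makespan wins
            best, best_f = s, f
    return best

print(ns4s()["makespan"] <= 100.0)      # → True (the initial makespan never worsens)
```

The key invariant mirrored from the pseudocode is that the incumbent S is perturbed freely each iteration, while S* is only ever replaced on strict improvement.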
Open Source Code | Yes
LLM Response: NS4S not only surpasses the Memetic Algorithm (MA) in terms of both solution quality and computational efficiency but also establishes nine new upper bounds across the benchmark instances. Repository: https://github.com/Zoommy/NS4S
Open Datasets | Yes
LLM Response: Benchmark Datasets: To evaluate the efficacy of the proposed NS4S algorithm, we conducted comprehensive experiments across three classical scheduling problems: JSP, FJSP, and FJSP-SDST. For JSP, we utilized the TA and DMU benchmark instances, comprising 160 instances that represent the most challenging publicly available test cases. For FJSP, we employed the BR, HU, and Vdata benchmark suites, encompassing 169 diverse instances. For FJSP-SDST, we employed the SDST-Hudata benchmark set, containing 20 instances of varying complexity.
Dataset Splits | No
LLM Response: The paper discusses various benchmark instances for evaluation (TA, DMU, BR, HU, Vdata, SDST-Hudata) and mentions that the VeEvo framework uses "randomly generated instances" for training its heuristics. However, it does not specify how these benchmark instances are split into training, validation, or test sets for the overall experimental evaluation, nor does it detail how the "randomly generated instances" used to train VeEvo are produced.
Hardware Specification | Yes
LLM Response: Our implementation combines Python for the VeEvo framework and C++ for the neighborhood search algorithm, executed on a single Intel Xeon E5-2697 v3 CPU.
Software Dependencies | No
LLM Response: The paper states that the implementation combines Python for the VeEvo framework and C++ for the neighborhood search algorithm, but it does not provide version numbers for these languages or for any libraries, frameworks, or solvers utilized.
Experiment Setup | Yes
LLM Response: To ensure a fair comparison with Ye et al. [2024], we configured VeEvo with a population size of 10 and a maximum of 100 evaluations. We imposed cutoff times of 10 seconds for JSP and FJSP, and 30 seconds for FJSP-SDST.
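The reported settings could be collected in a single configuration object for a re-run, sketched below. The field names (`population_size`, `max_evaluations`, `cutoff_seconds`) are hypothetical, chosen for this sketch; only the numeric values come from the paper.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class NS4SConfig:
    # Values as reported in the paper's experiment setup section.
    population_size: int = 10          # VeEvo population size
    max_evaluations: int = 100         # VeEvo evaluation budget
    cutoff_seconds: dict = field(default_factory=lambda: {
        "JSP": 10,                     # per-instance time limit (seconds)
        "FJSP": 10,
        "FJSP-SDST": 30,
    })

cfg = NS4SConfig()
print(cfg.population_size, cfg.cutoff_seconds["FJSP-SDST"])  # → 10 30
```

Keeping the per-problem cutoffs in one mapping makes it explicit that FJSP-SDST is given a longer budget (30 s) than the other two problems (10 s each).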