Automated Strategy Invention for Confluence of Term Rewrite Systems

Authors: Liao Zhang, Fabian Mitterwallner, Jan Jakubuv, Cezary Kaliszyk

IJCAI 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In this paper, we focus on confluence of term rewrite systems, and apply AI techniques to invent strategies for automatic confluence proving. Moreover, we randomly generate a large dataset to analyze confluence for term rewrite systems. We improve the state-of-the-art automatic confluence prover CSI: When equipped with our invented strategies, it surpasses its human-designed strategies both on the augmented dataset and on the original human-created benchmark dataset ARI-COPS, proving/disproving the confluence of several term rewrite systems for which no automated proofs were known before.
Researcher Affiliation Academia 1University of Innsbruck 2Shanghai Jiao Tong University 3Czech Technical University in Prague 4University of Melbourne EMAIL, EMAIL, EMAIL. All listed institutions are universities, indicating academic affiliations for all authors. The email domains also include an academic domain (.ac.at) for one author, while the others use generic Gmail, but are associated with the academic institutions.
Pseudocode Yes Algorithm 1 Grackle Loop: an outline of the strategy portfolio invention loop. Algorithm 2 Term Generation
Open Source Code No The paper does not contain any explicit statement from the authors about releasing their own code or provide a link to a code repository for the methodology described.
Open Datasets Yes We also augment the human-built confluence problems database (ARI-COPS)1, a representative benchmark for the annual confluence competition (Co Co)2. 1https://ari-cops.uibk.ac.at/ 2https://project-coco.uibk.ac.at/
Dataset Splits Yes For training purposes, we arbitrarily select 283 examples from ARI-COPS and 800 examples from the randomly generated dataset. To build the test dataset, we exclude the examples in the training dataset, subsequently randomly selecting 800 examples from the randomly generated dataset and the remaining 283 examples from ARI-COPS.
Hardware Specification Yes Here, a CPU denotes a core of the AMD EPYC 7513 32-core processor.
Software Dependencies No The paper mentions several tools used, such as CSI, Grackle, Param ILS, and SMAC3, but does not provide specific version numbers for any of them. For example, 'CSI is one of the state-of-the-art automatic confluence provers that participates in Co Co.' and 'Grackle [H ula and Jakub uv, 2022] is a strategy optimization system...'
Experiment Setup Yes The Grackle time limit for proving a TRS is 30 seconds, employed both in the evaluation and the strategy specialization phases. During the specialization phase, Grackle launches Param ILS for parameter tuning. The overall time limit for one strategy specialization phase is 45 minutes. The total execution time of Grackle is two days. Grackle performs parallel execution in both the evaluation and specialization phases; thus, we also limit the number of CPUs it can use. For each dataset, we perform two Grackle runs, configuring the numbers of available CPUs for a single strategy run to be either one or four. When it is set to one and four, the total number of available CPUs for Grackle is set to 52 and 66, respectively.