Automated Strategy Invention for Confluence of Term Rewrite Systems
Authors: Liao Zhang, Fabian Mitterwallner, Jan Jakubuv, Cezary Kaliszyk
IJCAI 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we focus on confluence of term rewrite systems, and apply AI techniques to invent strategies for automatic confluence proving. Moreover, we randomly generate a large dataset to analyze confluence for term rewrite systems. We improve the state-of-the-art automatic confluence prover CSI: When equipped with our invented strategies, it surpasses its human-designed strategies both on the augmented dataset and on the original human-created benchmark dataset ARI-COPS, proving/disproving the confluence of several term rewrite systems for which no automated proofs were known before. |
| Researcher Affiliation | Academia | 1University of Innsbruck 2Shanghai Jiao Tong University 3Czech Technical University in Prague 4University of Melbourne EMAIL, EMAIL, EMAIL. All listed institutions are universities, indicating academic affiliations for all authors. The email domains also include an academic domain (.ac.at) for one author, while the others use generic Gmail, but are associated with the academic institutions. |
| Pseudocode | Yes | Algorithm 1 Grackle Loop: an outline of the strategy portfolio invention loop. Algorithm 2 Term Generation |
| Open Source Code | No | The paper does not contain any explicit statement from the authors about releasing their own code or provide a link to a code repository for the methodology described. |
| Open Datasets | Yes | We also augment the human-built confluence problems database (ARI-COPS)1, a representative benchmark for the annual confluence competition (Co Co)2. 1https://ari-cops.uibk.ac.at/ 2https://project-coco.uibk.ac.at/ |
| Dataset Splits | Yes | For training purposes, we arbitrarily select 283 examples from ARI-COPS and 800 examples from the randomly generated dataset. To build the test dataset, we exclude the examples in the training dataset, subsequently randomly selecting 800 examples from the randomly generated dataset and the remaining 283 examples from ARI-COPS. |
| Hardware Specification | Yes | Here, a CPU denotes a core of the AMD EPYC 7513 32-core processor. |
| Software Dependencies | No | The paper mentions several tools used, such as CSI, Grackle, Param ILS, and SMAC3, but does not provide specific version numbers for any of them. For example, 'CSI is one of the state-of-the-art automatic confluence provers that participates in Co Co.' and 'Grackle [H ula and Jakub uv, 2022] is a strategy optimization system...' |
| Experiment Setup | Yes | The Grackle time limit for proving a TRS is 30 seconds, employed both in the evaluation and the strategy specialization phases. During the specialization phase, Grackle launches Param ILS for parameter tuning. The overall time limit for one strategy specialization phase is 45 minutes. The total execution time of Grackle is two days. Grackle performs parallel execution in both the evaluation and specialization phases; thus, we also limit the number of CPUs it can use. For each dataset, we perform two Grackle runs, configuring the numbers of available CPUs for a single strategy run to be either one or four. When it is set to one and four, the total number of available CPUs for Grackle is set to 52 and 66, respectively. |