Argumentative Reasoning in ASPIC+ under Incomplete Information
Authors: Daphne Odekerken, Tuomo Lehtonen, Johannes P. Wallner, Matti Järvisalo
JAIR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our contributions consist of a theoretical analysis of the complexity of deciding stability and relevance as well as first exact algorithms for reasoning about stability and relevance in incomplete ASPIC+ theories. ... Furthermore, we provide an open-source implementation of the algorithms, and show empirically that the implementation exhibits promising scalability on both real-world and synthetic data. |
| Researcher Affiliation | Academia | Authors Contact Information: Daphne Odekerken, orcid: 0000-0003-0285-0706, EMAIL, Department of Information and Computing Sciences, Utrecht University, and National Police Lab AI, Netherlands Police, Utrecht, The Netherlands; Tuomo Lehtonen, orcid: 0000-0001-6117-4854, EMAIL, Department of Computer Science, University of Helsinki, and Department of Computer Science, Aalto University, Helsinki, Finland; Johannes P. Wallner, orcid: 0000-0002-3051-1966, EMAIL, Institute of Software Engineering and Artificial Intelligence, Graz University of Technology, Graz, Austria; Matti Järvisalo, orcid: 0000-0003-2572-063X, EMAIL, Department of Computer Science, University of Helsinki, Helsinki, Finland. |
| Pseudocode | Yes | Our algorithmic approach to deciding whether a given queryable is 𝑗-relevant for a given literal is presented as Algorithm 1. ... Algorithm 2. ... We present two separate ASP encodings for deciding the justification status of literals: one (𝜋≤-just) taking rule preferences into account, the other (𝜋just) assuming that ≤ = ∅. ... Listing 1 Module 𝜋common ... Listing 2 Module Δ𝑗𝑢𝑠𝑡 ... Listing 3 Module Δ≤-𝑗𝑢𝑠𝑡 |
| Open Source Code | Yes | Furthermore, we provide an open-source implementation of the algorithms, and show empirically that the implementation exhibits promising scalability on both real-world and synthetic data. ... The implementation is available in open source at https://bitbucket.org/coreo-group/raspic2. |
| Open Datasets | No | For real-world benchmarks, we generated instances for the stability and relevance problems based on the argumentation system AS = (L, ¯, R, n) and set of queryables Q used in an inquiry system for the intake of online trade fraud at the Netherlands Police [38]. ... To further study the scalability of our implementations, we also consider synthetic data. For this, we generated argumentation theories and queryable sets that are parametrised by the size of the language |L| and rule set size |R|. Explanation: The paper uses instances generated from a real-world inquiry system as well as generated synthetic data, but it does not provide concrete access information (link, DOI, repository, or clear statement of public availability) for either the real-world or synthetic datasets. |
| Dataset Splits | No | To generate stability instances, we obtained knowledge bases by randomly sampling 25 consistent subsets of each size between 1 and 14 from Q, as well as the empty knowledge base. Similarly, instances for relevance were created for each combination of stability instances and a queryable in Q, randomly sampled from the set of queryables that are not axioms and whose contradictory is not an axiom. Explanation: The paper describes how instances were generated for benchmarks (random sampling of knowledge bases and selection of topic/queryable), but it does not specify traditional training/test/validation splits for a dataset. |
| Hardware Specification | Yes | All experiments were run on 2.50 GHz Intel Xeon Gold 6248 machines under a per-instance time limit of 600 seconds and memory limit of 32 GB. |
| Software Dependencies | Yes | We use Clingo [25, 23, 24] (version 5.5.1) as the ASP solver and its incremental (multi-shot) features [24] for implementing the CEGAR algorithms for relevance. |
| Experiment Setup | Yes | All experiments were run on 2.50 GHz Intel Xeon Gold 6248 machines under a per-instance time limit of 600 seconds and memory limit of 32 GB. ... For the language size (|L|), we generated instances for the stability instances with |L| ∈ {50, 100, 150, 200, 250, 500, 1000, 2500, 5000} and for the relevance instances with |L| ∈ {50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150}. The number of rules was chosen to be |R| ∈ {(1/2)|L|, |L|, (3/2)|L|}. The body size of rules was chosen to be between 1 and 5, with one third of the rules having one antecedent, another third having two antecedents, and the remaining third split equally to have three, four, or five antecedents. The literal layer distribution was selected by having (2/3)|L| literals with layer 0, one tenth of the literals each for layers 1, 2, and 3, and the remaining ones with layer 4. The ratio between queryables and literals (|Q|/|L|) is 0.5. The ratio between axioms and queryables (|K|/|Q|) is 0.5. |
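The synthetic-instance parameters quoted in the Experiment Setup row can be illustrated with a small generator sketch. This is a reconstruction under stated assumptions, not the authors' actual generator: the function name `generate_theory`, the uniform sampling of rule heads and bodies, and the way layers are assigned are all illustrative choices; only the numeric ratios (body-size distribution, layer distribution, |Q|/|L| = 0.5, |K|/|Q| = 0.5) come from the quoted text.

```python
import random

def generate_theory(n_literals, n_rules, seed=0):
    """Sketch of a synthetic ASPIC+ instance generator following the
    parameter choices quoted above (illustrative, not the paper's code)."""
    rng = random.Random(seed)
    literals = [f"l{i}" for i in range(n_literals)]

    # Layer distribution: 2/3 of literals at layer 0, one tenth each
    # at layers 1, 2 and 3, and the remainder at layer 4.
    shuffled = literals[:]
    rng.shuffle(shuffled)
    counts = [2 * n_literals // 3] + [n_literals // 10] * 3
    layers, idx = {}, 0
    for layer, count in enumerate(counts):
        for lit in shuffled[idx:idx + count]:
            layers[lit] = layer
        idx += count
    for lit in shuffled[idx:]:
        layers[lit] = 4

    # Body sizes: one third of rules have 1 antecedent, one third have 2,
    # the remaining third is split equally over sizes 3, 4 and 5.
    def body_size():
        r = rng.random()
        if r < 1 / 3:
            return 1
        if r < 2 / 3:
            return 2
        return rng.choice([3, 4, 5])

    # Rules as (body, head) pairs; heads and bodies sampled uniformly here,
    # which is an assumption of this sketch.
    rules = [(rng.sample(literals, body_size()), rng.choice(literals))
             for _ in range(n_rules)]

    # Ratios from the quoted setup: |Q|/|L| = 0.5 and |K|/|Q| = 0.5.
    queryables = rng.sample(literals, n_literals // 2)
    axioms = rng.sample(queryables, len(queryables) // 2)
    return {"literals": literals, "layers": layers, "rules": rules,
            "queryables": queryables, "axioms": axioms}
```

For example, `generate_theory(100, 150)` yields a theory with 100 literals, 150 rules with body sizes between 1 and 5, 50 queryables, and 25 axioms.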