reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

A Datalog Rewriting Algorithm for Warded Ontologies

Authors: Davide Benedetto, Marco Calautti, Hebatalla Hammad, Emanuel Sallinger, Adriano Vlad-Starrabba

IJCAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our goal is to fill the above gap by designing a novel Datalog rewriting algorithm for OMQs over warded ontologies which is amenable to practical implementations, as well as providing an implementation and an experimental evaluation, with the aim of understanding how key input parameters affect the performance of this approach, and what are its limits when combined with off-the-shelf Datalog-based engines. [...] In this section we discuss how we implemented the algorithm Warded Rewrite. Moreover, we present an experimental analysis assessing two aspects: how fast is of our implementation of Warded Rewrite to produce the Datalog rewriting, and how efficient existing Datalog-based engines are at evaluating the Datalog query produced by our algorithm.
Researcher Affiliation	Collaboration	Davide Benedetto1 , Marco Calautti2 , Hebatalla Hammad3 , Emanuel Sallinger4 , Adriano Vlad-Starrabba1 1Prometheux 2University of Milan 3University of Trento 4TU Wien, Austria and University of Oxford, UK EMAIL, EMAIL, EMAIL EMAIL, EMAIL. Prometheux is a company, while University of Milan, University of Trento, TU Wien, and University of Oxford are academic institutions, indicating a collaboration.
Pseudocode	Yes	Algorithm 1: Warded Rewrite Input: An OMQ O = (q, Σ) with Σ WARDED Output: A Datalog rewriting of O 1 Qcurrent := {can(q)}; Qexplored := ; Qdec := ; R := ; 2 while Qcurrent = do 3 Qnew := ; 4 foreach q Qcurrent do 5 if a decomposition of q w.r.t. Σ exists then // Decomposition step 6 S := decompose Optimal(q , Σ); 7 ρ := reconciliation Rule(q , S); 8 foreach q S do 9 if there is qalt Qdec with qalt q then 10 Replace head Pred(q ) with head Pred(qalt) in body(ρ); 11 Qdec := Qdec {can(q )}; 12 Qnew := Qnew {can(q )}; 13 R := R {ρ}; // Resolution step 14 foreach σ Σ do 15 foreach MGCU M of q with σ do 16 q := resolvσ,M(q ); 17 Qnew := Qnew {can(q )}; 18 R := R {rule(q )}; 19 Qexplored := Qexplored {q }; 20 Qcurrent := Qnew Qexplored; 21 return (head Pred(q), R);
Open Source Code	Yes	The full source code and the benchmark data are available at: https://gitlab.com/mcalautti/warded-rewriting-paper.
Open Datasets	Yes	For this, we generate a large pool of synthetic database-OMQ pairs (D, O), which we call scenarios, where O = (q, Σ) and Σ WARDED, using the IWARDED benchmark generator from [Atzeni et al., 2022]. [...] The full source code and the benchmark data are available at: https://gitlab.com/mcalautti/warded-rewriting-paper.
Dataset Splits	No	The paper describes generating synthetic database-OMQ pairs (scenarios) using the IWARDED benchmark generator with various parameters, and then evaluating these scenarios. It does not mention dividing a pre-existing dataset into training, validation, or test splits; rather, the data is generated for each test scenario.
Hardware Specification	Yes	For the experiments, we used an Amazon Elastic Compute Cloud (EC2) instance with an Intel(R) Xeon(TM) Platinum 8000-series @3.6 GHz, 16 GB of RAM, running Amazon Linux 2 LTS Candidate.
Software Dependencies	Yes	We implemented the algorithm Warded Rewrite in Java 22, and used open JDK 22 for its execution, while the Datalog-based reasoning engines have been executed in their default configuration.
Experiment Setup	Yes	Ontology Parameters. The average recursion length, denoted avg recursion, specifies the average length of recursive TGD sequences... The number of left-right-join recursions, denoted num lrj, specifies the number of TGDs... Database Parameters. As for the database parameters, we consider the number of facts per predicate, denoted db facts... Test Scenarios. In generating our scenarios, we consider 21 different values for the number of left-rightjoin recursions num lrj and the average recursion length avg recursion, spanning from 0 to 400 with steps of 20 each. ... Regarding the database generation, we consider a fixed value of 100k facts per predicate (db facts). Then, for each combination i, j {0, 20, 40, . . . , 400}, we generated a scenario, denoted S[i, j], of the form (D, O) with O = (q, Σ), by running IWARDED with input num lrj = i, avg recursion = j, and db facts = 100k, obtaining a total of 441 scenarios.