First-Order Context-Specific Likelihood Weighting in Hybrid Probabilistic Logic Programs

Authors: Nitesh Kumar, Ondřej Kuželka, Luc De Raedt

JAIR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We empirically demonstrate that FO-CS-LW scales with domain size and provide an open-source implementation of our framework."
Researcher Affiliation | Academia | Nitesh Kumar (EMAIL), Department of Computer Science, KU Leuven, Belgium; Ondřej Kuželka (EMAIL), Faculty of Electrical Engineering, Czech Technical University in Prague, Czech Republic; Luc De Raedt (EMAIL), Department of Computer Science, KU Leuven, Belgium.
Pseudocode | Yes | Algorithm 1: DC(B) Proof Procedure; Algorithm 2: Simulation of DC(B) Programs; Algorithm 3: DC(B) Proof Procedure Marked; Algorithm 4: Generation of residual evidence's weights for DC(B) Programs; Algorithm 5: Simulation of DC# Programs; Algorithm 6: DC# Proof Procedure; Algorithm 7: Generation of residual evidence's weights for DC# Programs.
Open Source Code | Yes | "We empirically demonstrate that FO-CS-LW scales with domain size and provide an open-source implementation of our framework." This paper is a significantly extended and completed version of the authors' previous paper (Kumar & Kuželka, 2021), which introduced CS-LW to exploit the structures present in CPDs of BNs; the present paper first introduces a language to describe first-order probabilistic models and then extends CS-LW to the first-order case. The code is publicly available: https://github.com/niteshroyal/DC-Sharp
Open Datasets | Yes | We identified two BNs from the Bayesian network repository (Elidan, 2001) that have many structures within CPDs: i) Alarm, a monitoring system for patients, with 37 variables; ii) Andes, an intelligent tutoring system, with 223 variables. For this purpose, we compared FO-CS-LW with the inference algorithm used in ProbLog, one of the most popular PLP systems; this algorithm first grounds first-order programs and then performs exact inference to exploit the structures of clauses in the programs. We used the ProbLog program shown in Figure 5 for this experiment. We also used real-world relational data generated by processing the financial database from the PKDD'99 Discovery Challenge. Similar observations were made when we used the program learned from the real-world NBA data set as described in (Kumar et al., 2021); this data set is about basketball matches from the National Basketball Association (Schulte & Routley, 2014).
Dataset Splits | No | The paper mentions creating "multiple subsets of the financial data set with varying numbers of accounts" and "multiple subsets of the data set with varying numbers of actions (the domain size)" but does not provide specific percentages, absolute counts, or a detailed splitting methodology that would allow reproduction without the code. For the Alarm and Andes BNs, it discusses queries and evidence but not specific training/testing/validation splits.
Hardware Specification | No | "ProbLog could not compute probabilities beyond the domain size of 9 on a machine with 132 GB main memory, whereas FO-CS-LW quickly scaled to a domain size of 50." The paper refers to a machine with 132 GB of main memory but does not specify the CPU or GPU models used for the experiments.
Software Dependencies | No | "CS-LW is implemented in the Prolog programming language, thus to compare the sampling speed of LW with CS-LW, we need a similar implementation of LW." The paper mentions the Prolog programming language but does not specify a version number or the versions of any libraries/packages used in the implementation.
Experiment Setup | Yes | "To estimate the probabilities, FO-CS-LW used 10,000 samples." The size n indicates that n clients, n accounts, and n loans are present in the program. "The shaded region denotes the standard deviation from the mean probability estimated by FO-CS-LW when executed 30 times for each n." 10,000 samples were used for each of the two algorithms. "The shaded region denotes the standard deviation from the mean probability estimated by the algorithms when executed 100 times for each case."
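The paper's LW baseline is implemented in Prolog and not reproduced in this assessment. As a hedged illustration of what plain likelihood weighting computes, here is a minimal Python sketch on a hypothetical two-variable Bayesian network (the network, its probabilities, and the function name are invented for illustration and are not taken from the paper):

```python
import random

# Hypothetical toy BN: Burglary -> Alarm.
# P(B=true) = 0.1; P(A=true | B=true) = 0.9; P(A=true | B=false) = 0.05.
P_B = 0.1
P_A_GIVEN_B = {True: 0.9, False: 0.05}

def lw_estimate(n_samples, seed=None):
    """Estimate P(B=true | A=true) by likelihood weighting.

    Non-evidence variables are sampled from their priors; each sample
    is weighted by the likelihood of the evidence (here, A=true).
    """
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n_samples):
        b = rng.random() < P_B      # sample B from its prior
        w = P_A_GIVEN_B[b]          # weight = P(A=true | sampled B)
        den += w
        if b:
            num += w
    return num / den                # converges to 2/3 for this toy BN
```

The exact posterior here is 0.1·0.9 / (0.1·0.9 + 0.9·0.05) = 2/3, so the estimate should concentrate around that value as the sample count grows; CS-LW and FO-CS-LW refine this basic scheme by exploiting context-specific structure in the CPDs.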
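The repeated-run methodology above (a fixed number of samples per estimate, the estimator executed 30 or 100 times, and the standard deviation around the mean shaded) can be sketched as follows. The stand-in estimator and its true probability of 0.3 are placeholder assumptions, not values from the paper:

```python
import random
import statistics

def estimate_probability(n_samples, rng):
    # Stand-in Monte Carlo estimator for a hypothetical event with
    # true probability 0.3 (illustrative only, not the paper's model).
    return sum(rng.random() < 0.3 for _ in range(n_samples)) / n_samples

def repeated_runs(n_runs=30, n_samples=10_000, seed=0):
    """Run the estimator n_runs times with n_samples each, mirroring the
    paper's setup of 10,000 samples per estimate repeated 30 times.

    Returns the mean estimate and the sample standard deviation
    (the quantity plotted as the shaded region)."""
    rng = random.Random(seed)
    estimates = [estimate_probability(n_samples, rng) for _ in range(n_runs)]
    return statistics.mean(estimates), statistics.stdev(estimates)
```

With 10,000 samples per run, the spread across runs is small (on the order of sqrt(p(1-p)/10,000) ≈ 0.005 for p = 0.3), which is why a shaded standard-deviation band around the mean is a natural way to report the results.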