Improved High Dimensional Discrete Bayesian Network Inference using Triplet Region Construction
Authors: Peng Lin, Martin Neil, Norman Fenton
JAIR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that it also achieves significantly more accurate results than competing algorithms. ... In section 5 we present experiments where we compare TRC with VI based algorithms and other region-based algorithms using challenging and high tree-width (also high dimensional) BNs as test cases. |
| Researcher Affiliation | Collaboration | Peng Lin (EMAIL), School of Statistics, Capital University of Economics and Business, Beijing 100070, P. R. China; Martin Neil (EMAIL) and Norman Fenton (EMAIL), School of Electronic Engineering and Computer Science, Queen Mary University of London, London E1 4NS, and Agena Ltd, UK |
| Pseudocode | Yes | We summarize the ORI algorithm in Algorithm 1. ... Algorithm 1: ORI algorithm ... Algorithm 2: RGBF algorithm ... The TRC algorithm (Algorithm 3) is a sequential combination of ORI and its optimization, RGBF, and CCCP. ... Algorithm 3: TRC algorithm |
| Open Source Code | Yes | The TRC code, test cases and random factors are publicly available in our code repository (Lin, 2020). |
| Open Datasets | Yes | All other test models are publicly obtainable. ... We use the Bayes Grid BN (den Broeck et al., 2014) to test the ability of ORI and the competing algorithms to find the most efficient regions. ... PASCAL challenge models (Elidan et al., 2011) including Promedas (t.w. 4 to 28, variables 400 to 900), Pedigree (t.w. 19, variables 385) and Protein 8 (t.w. 33, variables 14306). ... BNs hosted in the Bayesian nets repository (Elidan, 1998) including Barley (Kristensen, 1998) (t.w. 5, variables 48), Pedigree Pigs (Jensen, 1998) (t.w. 11, variables 441), Diabetes (Andreassen et al., 1991) (t.w. 5, variables 413), the linkage analysis model (Jensen & Kong, 1996) (t.w. 13, variables 714), and the Munin model (Andreassen et al., 1989) (t.w. 8, variables 1041). |
| Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits with exact percentages, sample counts, or references to predefined splits for reproduction. It mentions simulating instances of random factors (e.g., "simulated 100 instances of random factors", "simulated ten instances of random factors") and testing on specific models, but not dataset splitting methodology. |
| Hardware Specification | Yes | The environment for testing was Java JDK 1.8, Intel i5-4300M. ... The CPU processor was an i5-4300M. |
| Software Dependencies | Yes | The environment for testing was Java JDK 1.8, Intel i5-4300M. We also used the existing software packages fastInf (Jaimovich et al., 2010), merlin (Marinescu, 2019) and GBP (Gelfand, 2011) as references for the testing. ... We implemented the CCCP algorithm in (AgenaRisk, 2020) without applying any particular optimization to the messages... |
| Experiment Setup | Yes | The convergence threshold is 1e-08. ... Fig. 11 (d) shows the CCCP results when not using RGBF, with a convergence threshold of 3.0E-03 and the number of inner loops set at 3. ... When RGBF is used with CCCP, also using an even more challenging convergence threshold of 1.0E-08 and with the number of inner loops set at 4, the results show significant improvement. ... For randomly generated factors we define normal factors as those generated by a uniform distribution over [0, 1], and extreme factors (marked by *) as random factors near zero and one. |
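The Experiment Setup row describes two kinds of randomly generated factors: "normal" factors drawn uniformly from [0, 1], and "extreme" factors with entries near zero and one. The paper's excerpt does not specify how "near zero and one" is realized; the sketch below is a hypothetical reconstruction in Python (the paper's own implementation is in Java and is available in the authors' repository), where `eps` is an assumed closeness parameter not taken from the paper.

```python
import random

def normal_factor(n, seed=None):
    """Normal factor: n entries drawn uniformly from [0, 1],
    as described in the paper's experiment setup."""
    rng = random.Random(seed)
    return [rng.uniform(0.0, 1.0) for _ in range(n)]

def extreme_factor(n, seed=None, eps=0.01):
    """Extreme factor (marked * in the paper): entries near 0 or 1.
    ASSUMPTION: the paper does not give the exact scheme; here each
    entry is sampled uniformly from [0, eps] or [1 - eps, 1] with
    equal probability."""
    rng = random.Random(seed)
    return [
        rng.uniform(0.0, eps) if rng.random() < 0.5
        else rng.uniform(1.0 - eps, 1.0)
        for _ in range(n)
    ]
```

Extreme factors of this kind are the harder test case because near-deterministic entries make message passing poorly conditioned, which is why the paper flags them separately.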