Improved High Dimensional Discrete Bayesian Network Inference using Triplet Region Construction
Authors: Peng Lin, Martin Neil, Norman Fenton
JAIR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that it also achieves significantly more accurate results than competing algorithms. ... In section 5 we present experiments where we compare TRC with VI based algorithms and other region-based algorithms using challenging and high tree-width (also high dimensional) BNs as test cases. |
| Researcher Affiliation | Collaboration | Peng Lin (EMAIL), School of Statistics, Capital University of Economics and Business, Beijing 100070, P. R. China; Martin Neil (EMAIL) and Norman Fenton (EMAIL), School of Electronic Engineering and Computer Science, Queen Mary University of London, London E1 4NS, and Agena Ltd, UK |
| Pseudocode | Yes | We summarize the ORI algorithm in Algorithm 1. ... Algorithm 1: ORI algorithm ... Algorithm 2: RGBF algorithm ... The TRC algorithm (Algorithm 3) is a sequential combination of ORI and its optimization, RGBF, and CCCP. ... Algorithm 3: TRC algorithm |
| Open Source Code | Yes | The TRC code, test cases and random factors are publicly available in our code repository (Lin, 2020). |
| Open Datasets | Yes | All other test models are publicly obtainable. ... We use the Bayes Grid BN (den Broeck et al., 2014) to test the ability of ORI and the competing algorithms to find the most efficient regions. ... PASCAL challenge models (Elidan et al., 2011) including Promedas (t.w. 4 to 28, variables 400 to 900), Pedigree (t.w. 19, variables 385) and Protein 8 (t.w. 33, variables 14306). ... BNs hosted in the Bayesian nets repository (Elidan, 1998) including Barley (Kristensen, 1998) (t.w. 5, variables 48), Pedigree Pigs (Jensen, 1998) (t.w. 11, variables 441), Diabetes (Andreassen et al., 1991) (t.w. 5, variables 413), the linkage analysis model (Jensen & Kong, 1996) (t.w. 13, variables 714), and the Munin model (Andreassen et al., 1989) (t.w. 8, variables 1041). |
| Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits with exact percentages, sample counts, or references to predefined splits for reproduction. It mentions simulating instances of random factors (e.g., "simulated 100 instances of random factors", "simulated ten instances of random factors") and testing on specific models, but not dataset splitting methodology. |
| Hardware Specification | Yes | The environment for testing was Java JDK 1.8, Intel i5-4300M. ... The CPU processor was an i5-4300M. |
| Software Dependencies | Yes | The environment for testing was Java JDK 1.8, Intel i5-4300M. We also used the existing software packages fastInf (Jaimovich et al., 2010), merlin (Marinescu, 2019) and GBP (Gelfand, 2011) as references for the testing. ... We implemented the CCCP algorithm in (AgenaRisk, 2020) without applying any particular optimization to the messages... |
| Experiment Setup | Yes | The convergence threshold is 1e-08. ... Fig. 11 (d) shows the CCCP results when not using RGBF, with a convergence threshold of 3.0E-03 and the number of inner loops set at 3. ... When RGBF is used with CCCP, also using an even more challenging convergence threshold of 1.0E-08 and with the number of inner loops set at 4, the results show significant improvement. ... For randomly generated factors we define normal factors as those generated by a uniform distribution over [0, 1], and extreme factors (marked by *) as random factors near zero and one. |
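The Experiment Setup row describes two kinds of randomly generated factors: "normal" factors drawn uniformly from [0, 1], and "extreme" factors with entries near zero and one. The paper's excerpt does not specify how "near zero and one" is realized; the sketch below is a hypothetical reconstruction in Python (the paper's own implementation is in Java and is available in the authors' repository), where `eps` is an assumed closeness parameter not taken from the paper.

```python
import random

def normal_factor(n, seed=None):
    """Normal factor: n entries drawn uniformly from [0, 1],
    as described in the paper's experiment setup."""
    rng = random.Random(seed)
    return [rng.uniform(0.0, 1.0) for _ in range(n)]

def extreme_factor(n, seed=None, eps=0.01):
    """Extreme factor (marked * in the paper): entries near 0 or 1.
    ASSUMPTION: the paper does not give the exact scheme; here each
    entry is sampled uniformly from [0, eps] or [1 - eps, 1] with
    equal probability."""
    rng = random.Random(seed)
    return [
        rng.uniform(0.0, eps) if rng.random() < 0.5
        else rng.uniform(1.0 - eps, 1.0)
        for _ in range(n)
    ]
```

Extreme factors of this kind are the harder test case because near-deterministic entries make message passing poorly conditioned, which is why the paper flags them separately.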