Block Domain Knowledge-Driven Learning of Chain Graphs Structure

Authors: Shujing Yang, Fuyuan Cao

JAIR 2024

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | Meanwhile, we conduct theoretical analysis to prove the correctness of our algorithm and compare it with the LCD algorithm and MbLWF algorithm on synthetic and real-world datasets. The experimental results validate the effectiveness of our algorithm. Section 4 reports experimental results to illustrate the performance of the KDLCG algorithm.
Researcher Affiliation | Academia | Shujing Yang EMAIL; Fuyuan Cao (corresponding author) EMAIL; School of Computer and Information Technology, Shanxi University, Taiyuan, 030006, China
Pseudocode | Yes | Algorithm 1: learn the Adj and SP (learn-AS); Algorithm 2: learn LWF CG structure (KDLCG)
Open Source Code | No | The paper states: "We implemented all algorithms in R by extending the code from the bnlearn (Javidian et al., 2020b), lcd (Ma et al., 2008), and pcalg (Kalisch et al., 2012) packages to LWF CGs." This indicates the authors built on existing packages but do not provide a link or an explicit statement that their own KDLCG implementation is open source or otherwise available.
Open Datasets | Yes | To evaluate the performance of the proposed algorithm, we perform extensive experiments to contrast our proposed KDLCG algorithm against the state-of-the-art LCD algorithm and MbLWF algorithm. ... we verify the effectiveness of all algorithms on time series datasets of insilico_size10_1 and insilico_size10_2 with feedback loops (n = 105, p = 10) in the DREAM4 Network Inference Challenge (Marbach, Schaffter, Mattiussi, & Floreano, 2009; Greenfield, Madar, Ostrer, & Bonneau, 2010).
Dataset Splits | No | For synthetic data, the paper mentions "training databases with the sample size of n = 500 or 5000 from this probability distribution" and "training databases with the sample size of n = 50 or 100 from this probability distribution." For real-world data, it mentions "time series datasets of insilico_size10_1 and insilico_size10_2 with feedback loops (n = 105, p = 10)." These specify total sample sizes but no explicit train/validation/test splits.
Hardware Specification | No | The paper uses running time as an evaluation metric but gives no details about the hardware (CPU or GPU models, memory, or cloud instances) used to run the experiments.
Software Dependencies | No | "We implemented all algorithms in R by extending the code from the bnlearn (Javidian et al., 2020b), lcd (Ma et al., 2008), and pcalg (Kalisch et al., 2012) packages to LWF CGs." Specific version numbers for R and the listed packages are not provided.
Experiment Setup | Yes | For each sample, the significance levels α of the LCD algorithm, the MbLWF algorithm, and our KDLCG algorithm are respectively set at the values of 0.005 or 0.05 to perform the hypothesis tests. We generate a random chain graph on V as follows: (1) order the p vertices and initialize a p × p adjacency matrix A with zeros; (2) for each element in the lower triangle of A, set it to a random number drawn from a Bernoulli distribution with occurrence probability s = N/(p − 1); ... (6) set A_ij = 0 for any pair (i, j) such that i ∈ I_l, j ∈ I_m with l > m. We consider random CGs with p ∈ {10, 20, 40} and N ∈ {2, 3}.
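The quoted generation recipe shows only steps (1), (2), and (6); steps (3)-(5) are elided in the excerpt. A rough sketch of the quoted steps is given below. This is not the authors' R code: the function name, the use of Python/NumPy, and the `components` argument, which stands in for the chain-component partition I_1, ..., I_k produced by the elided steps, are all assumptions made for illustration.

```python
import numpy as np

def random_cg_adjacency(p, N, components, rng=None):
    """Sketch of the random chain-graph adjacency generation quoted above.

    components: an ordered partition of {0, ..., p-1} into chain
    components I_1, ..., I_k (how the paper constructs this partition
    is in the elided steps (3)-(5), so it is taken as input here).
    """
    rng = np.random.default_rng(rng)
    s = N / (p - 1)                      # edge-occurrence probability
    A = np.zeros((p, p), dtype=int)      # step (1): p x p zero matrix
    # step (2): fill the lower triangle with Bernoulli(s) draws
    for i in range(p):
        for j in range(i):
            A[i, j] = rng.random() < s
    # step (6): zero out A_ij whenever i lies in a later chain
    # component than j (i in I_l, j in I_m with l > m)
    comp_of = {}
    for l, comp in enumerate(components):
        for v in comp:
            comp_of[v] = l
    for i in range(p):
        for j in range(p):
            if comp_of[i] > comp_of[j]:
                A[i, j] = 0
    return A
```

Step (6) removes any entry pointing from a later chain component back into an earlier one, which enforces the acyclic ordering of chain components that a valid chain graph requires.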