Bayesian Network Learning via Topological Order

Authors: Young Woong Park, Diego Klabjan

JMLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | A computational experiment is presented for the Gaussian Bayesian network learning problem, an optimization problem that minimizes the sum of squared errors of regression models with an L1 penalty over a feature network, with application to gene network inference in bioinformatics. ... In the computational experiment, we compare the performance of the proposed MIP model and algorithms against the algorithm in Han et al. (2016) and other available MIP models on synthetic and real instances.
Researcher Affiliation | Academia | Young Woong Park (EMAIL), College of Business, Iowa State University, Ames, IA 50011, USA; Diego Klabjan (EMAIL), Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, IL 60208, USA.
Pseudocode | Yes | Algorithm 1: TOSA (Topological Order Swapping Algorithm); Algorithm 2: IR (Iterative Reordering); Algorithm 3: Greedy; Algorithm 4: GD (Gradient Descent).
Open Source Code | No | The paper mentions that "the R script of the original algorithm is available on the journal website," referring to the benchmark algorithm from Han et al. (2016), but it provides no statement or link for open-sourcing the code of the methodology described in this paper.
Open Datasets | Yes | We first test all algorithms with synthetic instances generated using the R package pcalg (Kalisch et al., 2012). ... Finally, in Section 5.4, we solve a popular real instance from the literature, Sachs et al. (2005). The data set has been studied in many works, including Friedman et al. (2008), Shojaie and Michailidis (2010), Fu and Zhou (2013), and Aragam and Zhou (2015).
Dataset Splits | No | The paper describes generating synthetic data and using a real-world flow cytometry data set. For the synthetic data, it details the generation process but gives no explicit train/test/validation splits. For the real data, it states "The data set has ... n = 7466 cells obtained from multiple experiments with m = 11 measurements" but does not specify how the data was split for training, validation, or testing in the experiments.
Hardware Specification | Yes | For all computational experiments, a server with two Xeon 2.70GHz CPUs and 24GB RAM is used.
Software Dependencies | Yes | The MIP models MIPcp, MIPin, and MIPto are implemented with CPLEX 12.6 in C#. For GD and IR, the algorithms are written in R (R Core Team, 2016). We use the glmnet function of the glmnet package (Friedman et al., 2010) for solving the LASSO linear regression problems in (16). The randomDAG function is used to generate a DAG, and the rmvDAG function is used to generate multivariate data with the standard normal error distribution. First, a DAG is generated by randomDAG. Next, the generated DAG and random coefficients are used to create each column (with standard normal error added) by rmvDAG, which uses linear regression as the underlying model. After obtaining the data matrix from the package, we standardize each column to have zero mean and standard deviation equal to one. The DAG used to generate the multivariate data is considered the true structure or true arc set, although it may not be the optimal solution for the score function. The random instances are generated for the various parameters described in the following.
Experiment Setup | Yes | For IR, we use parameters α = 0.01, t = 10, νlb = 0.8, and νub = 1.2. For GD, we use parameters α = 0.01, t1 = 10, and t2 = 5. ... We use four λ values, defined differently for each data set, in order to cover the expected number of arcs. For each sparse instance, we solve (14) with λ ∈ {1, 0.5, 0.1, 0.05}. For dense data sets, a wide range of λ values is needed to obtain selected arc sets whose cardinalities are similar to those of the true arc sets. Hence, for each dense instance, instead of fixed values over all instances in the set, we use λ values based on the expected density d: λ = λ0 · 10^(10d−1), where λ0 ∈ {1, 0.1, 0.01, 0.001}. For each high-dimensional instance, we use λ ∈ {1, 0.8, 0.6, 0.4}.
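The synthetic-data pipeline quoted under Software Dependencies (randomDAG to draw a DAG, rmvDAG to simulate columns with standard normal errors, then column standardization) can be sketched as follows. This is a minimal Python stand-in for the R pcalg calls, not the authors' code; the function name and the edge-weight range are assumptions.

```python
import numpy as np

def random_dag_data(m, n, density, rng):
    """Generate a random DAG and simulate standardized Gaussian data from it.

    Illustrative stand-in for pcalg's randomDAG/rmvDAG (the paper uses R);
    details such as the uniform(0.5, 1.5) weights are assumptions.
    """
    # Random DAG: arcs only from lower to higher index, so the identity
    # ordering 0, 1, ..., m-1 is a valid topological order by construction.
    A = np.triu(rng.random((m, m)) < density, k=1)   # boolean adjacency matrix
    B = A * rng.uniform(0.5, 1.5, size=(m, m))       # random arc coefficients
    # Simulate each column in topological order with N(0, 1) noise,
    # mirroring rmvDAG's linear-regression generating model.
    X = np.empty((n, m))
    for j in range(m):
        X[:, j] = X[:, :j] @ B[:j, j] + rng.standard_normal(n)
    # Standardize each column: zero mean, standard deviation equal to one.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    return A, X

A, X = random_dag_data(m=11, n=500, density=0.2, rng=np.random.default_rng(0))
```

The returned adjacency A plays the role of the true arc set against which the selected arcs are compared in the experiments.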
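Given a fixed topological order, the score described in the summary (sum of squared regression errors plus an L1 penalty) decomposes into one LASSO regression per variable on its predecessors in the order; the paper solves these subproblems with glmnet in R. Below is a hedged Python sketch, with scikit-learn's Lasso standing in for glmnet; the exact objective scaling (hence the alpha rescaling) is my assumption, not the paper's formulation.

```python
import numpy as np
from sklearn.linear_model import Lasso

def order_score(X, order, lam):
    """Evaluate sum_j ||x_j - X_pa(j) b_j||^2 + lam * ||b_j||_1 for a fixed
    topological order (illustrative sketch, not the authors' implementation).

    scikit-learn's Lasso minimizes (1/(2n)) * ||y - Xb||^2 + alpha * ||b||_1,
    so alpha = lam / (2n) matches the unnormalized objective above.
    """
    n, _ = X.shape
    total = 0.0
    for pos, j in enumerate(order):
        parents = order[:pos]              # only earlier variables may be parents
        if not parents:
            total += float(np.sum(X[:, j] ** 2))   # root node: no regression
            continue
        fit = Lasso(alpha=lam / (2 * n), fit_intercept=False)
        fit.fit(X[:, parents], X[:, j])
        resid = X[:, j] - fit.predict(X[:, parents])
        total += float(np.sum(resid ** 2) + lam * np.sum(np.abs(fit.coef_)))
        # nonzero entries of fit.coef_ give the selected arcs parents -> j
    return total

rng = np.random.default_rng(0)
Xs = rng.standard_normal((200, 4))
score = order_score(Xs, [0, 1, 2, 3], lam=0.5)
```

Under this reading, TOSA and IR act as searches over topological orders, re-evaluating a score of this kind after reordering, while the MIP models optimize order and arcs jointly.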