Improving Structure MCMC for Bayesian Networks through Markov Blanket Resampling
Authors: Chengwei Su, Mark E. Borsuk
JMLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments across a range of network sizes show that the MBR scheme outperforms other state-of-the-art algorithms, both in terms of learning performance and convergence rate. In particular, MBR achieves better learning performance than the other algorithms when the number of observations is relatively small and faster convergence when the number of variables in the network is large. |
| Researcher Affiliation | Academia | Chengwei Su EMAIL Mark E. Borsuk* EMAIL Thayer School of Engineering Dartmouth College Hanover, NH 03755-8000, USA |
| Pseudocode | Yes | Appendix A. Markov Blanket Resampling (MBR) Algorithm. For all domain variables {X1, ..., XN}, pre-compute and store lists of scores of all valid parent sets. We constrain the maximum number of parents a node can have to three in all our tests. Given the current DAG, perform an MBR move with probability pm, otherwise perform a traditional single-edge move. If an MBR move is to be performed, then proceed as follows: Randomly select a node Xi in the current graph G, with probability 1/N. Store the current parent set πi of node Xi. Delete all edges pointing into Xi and into its children Xj, with the exception of edges from Xi itself, thus obtaining G0. Find and store the set D(Xi|G0) of all of Xi's descendant nodes in the current graph G0. |
| Open Source Code | No | The paper does not contain any explicit statements about making the source code available, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Our algorithms were implemented in Matlab on a 128 GB machine with an Intel Xeon 2.00GHz processor. For the ALARM network with a simulated data set of m=1,000, precomputation of scores required 1017.6 sec, and implementation of 10,000 MCMC iterations required 382.6 sec for the standard structure algorithm, 401.5 sec for REV, and 407.3 sec for MBR. Given the 1/15 frequency of implementation, these times imply that the REV and MBR moves require approximately twice as much computation time as the conventional structure moves. This is consistent with the fact that the average number of children per node being considered in each iteration of MBR for ALARM was found to be 1.60. |
| Dataset Splits | No | The paper describes using simulated datasets of various sizes (e.g., "datasets of various sizes (m=50, 100, 250, 500, 1000) simulated from ALARM"), but it does not specify any training, testing, or validation splits for these datasets. The entire simulated dataset appears to be used for structure learning in each instance. |
| Hardware Specification | Yes | Our algorithms were implemented in Matlab on a 128 GB machine with an Intel Xeon 2.00GHz processor. |
| Software Dependencies | No | The paper states that "Our algorithms were implemented in Matlab." However, it does not specify a version number for Matlab or any other software dependencies with their versions. |
| Experiment Setup | Yes | For the standard structure sampler the burn-in was set to 500,000 iterations and 1000 DAGs were then collected by storing every 1000th iteration. For both the REV and MBR moves, the burn-in length was set to 312,500 and 1000 DAGs were collected by storing every 625th iteration. In our applications we implement the MBR move probabilistically in the MCMC algorithm, with a fixed probability of pm = 1/15. The traditional structure move consisting of the addition, deletion or reversal of a single edge is thus performed with probability ps = 1 − pm. We do, however, employ a fan-in restriction in which a maximum of three parents are allowed for any one node. |
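The opening steps of the MBR move quoted in the Pseudocode row, together with the pm = 1/15 move mixing from the Experiment Setup row, can be sketched in Python. This is a minimal, illustrative sketch, not the authors' implementation (theirs is in Matlab and not released): the DAG representation as a dict of parent sets is an assumption, the function names are hypothetical, and the later MBR steps (scoring and resampling new parent sets) are omitted.

```python
import random

def descendants(children, x):
    """All descendants of x, found by DFS over child lists."""
    seen, stack = set(), [x]
    while stack:
        for c in children.get(stack.pop(), ()):
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen

def mbr_move_setup(parents, x):
    """Opening steps of an MBR move, per the Appendix A excerpt:
    store the current parent set pi_i of x, delete all edges into x
    and into x's children (keeping only the edges from x itself) to
    obtain G0, and return the descendant set D(x | G0)."""
    stored_pi = set(parents[x])
    g0 = {v: set(ps) for v, ps in parents.items()}
    g0[x] = set()                          # delete all edges into x
    for child in [v for v, ps in parents.items() if x in ps]:
        g0[child] = {x}                    # keep only the edge from x
    # Rebuild child lists of G0 to find x's descendants in G0.
    child_map = {v: {c for c, ps in g0.items() if v in ps} for v in g0}
    return stored_pi, g0, descendants(child_map, x)

def choose_move(rng=random, pm=1 / 15):
    """Pick MBR with fixed probability pm, else a traditional
    single-edge move, as in the paper's experiment setup."""
    return "MBR" if rng.random() < pm else "single-edge"
```

For example, in a chain-like DAG A → B → C → D (with an extra edge A → C), an MBR move on C stores its parents {A, B}, severs all edges into C, and keeps only C → D among the edges into C's children.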