reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Stable Graphical Models

Authors: Navodit Misra, Ercan E. Kuruoglu

JMLR 2016 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We use simulated datasets for five benchmark network topologies to empirically demonstrate how StabLe improves upon ordinary least squares (OLS) regression. We also apply StabLe to microarray gene expression data for lymphoblastoid cells from 727 individuals belonging to eight global population groups. We establish that StabLe improves test set performance relative to OLS via ten-fold cross-validation.
Researcher Affiliation	Academia	Max Planck Institute for Molecular Genetics, Ihnestr. 63-73, 14195 Berlin, Germany; ISTI-CNR, Via G. Moruzzi 1, 56124 Pisa, Italy and Max Planck Institute for Molecular Genetics, Ihnestr. 63-73, 14195 Berlin, Germany
Pseudocode	Yes	Pseudo code for the methods is described in Algorithms 4 and 3.
Open Source Code	Yes	Source code for StabLe and data sets used here are available at https://sourceforge.net/projects/sgmodels/. SGEX is available upon request from the first author.
Open Datasets	Yes	We performed numerical experiments based on simulated data sets for five network topologies from the Bayesian network repository. These were (number of nodes, edges within brackets): ALARM (37, 46), BARLEY (48, 84), CHILD (20, 25), INSURANCE (27, 52) and MILDEW (35, 46). Adjacency matrix for each network was downloaded from the supplement to Tsamardinos et al. (2006). ... We downloaded pre-processed data for 727 individuals from eight global population groups as reported in Stranger et al. (2012). Details about the eight population groups are provided in Table 2. For each individual, the input data represented log-intensities for 21800 microarray probes ... Data sets can be downloaded from the ArrayExpress database http://www.ebi.ac.uk/arrayexpress/ using Series Accession Numbers E-MTAB-198 and E-MTAB-264.
Dataset Splits	Yes	We performed a ten-fold cross-validation for the top 100 ranked probes from the HapMap data. ... For the HapMap data, we chose each of the eight population groups in turn as the test set and learnt the optimal α-SG model for the rest of the samples.
Hardware Specification	No	The paper does not provide specific hardware details used for running its experiments.
Software Dependencies	No	The paper mentions algorithms like IRLS and K2Search, but does not provide specific version numbers for any software, libraries, or programming languages used.
Experiment Setup	Yes	We performed five sets of experiments for each network, corresponding to different values of α = 0.8, 1.1, 1.4, 1.7, 2.0. For each set of experiments, we chose ρ = 1.0, β = 0.9 and γ = 1.0. ... For learning regression coefficients during structure learning, IRLS was implemented with p = α/1.01, since lower values tended to give noisier estimates (possibly due to numerical errors). ... StabLe also performs a fixed number of random restarts to explore more of the search space. In all experiments reported here we used 10 random restarts.
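
The Experiment Setup entry above quotes an IRLS exponent of p = α/1.01 for each tested α. As a rough illustration only, the sketch below shows a generic iteratively reweighted least squares (IRLS) loop for L_p regression together with that exponent rule. It is not the authors' implementation: the function names `irls_lp` and `irls_exponent`, the iteration count, the residual floor `eps`, and the stopping tolerance are all assumptions made for this example.

```python
import numpy as np

# Values of alpha quoted in the Experiment Setup entry.
ALPHAS = [0.8, 1.1, 1.4, 1.7, 2.0]


def irls_exponent(alpha):
    """Exponent p = alpha / 1.01, as quoted in the Experiment Setup entry."""
    return alpha / 1.01


def irls_lp(X, y, p, n_iter=50, eps=1e-6):
    """Generic IRLS sketch: approximately minimize sum(|y - X @ w| ** p).

    n_iter and eps are illustrative defaults, not values from the paper.
    """
    # Start from the ordinary least squares solution.
    w = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - X @ w
        # Reweight each sample by |r|^(p-2); floor |r| at eps to avoid
        # division by zero when p < 2.
        weights = np.maximum(np.abs(r), eps) ** (p - 2)
        sw = np.sqrt(weights)
        # Weighted least squares step, solved as OLS on rescaled rows.
        w_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.allclose(w_new, w, atol=1e-10)
        w = w_new
        if converged:
            break
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    w_true = np.array([1.0, -2.0, 0.5])
    # Heavy-tailed (Cauchy) noise, chosen here only to make the demo interesting.
    y = X @ w_true + 0.1 * rng.standard_cauchy(size=200)
    for alpha in ALPHAS:
        print(alpha, irls_lp(X, y, irls_exponent(alpha)))
```

With p = 2 every weight equals 1, so the loop reduces to ordinary least squares; smaller p down-weights large residuals, which is why L_p fits of this kind are commonly used with heavy-tailed noise.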