reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Quantile Graphical Models: a Bayesian Approach

Authors: Nilabja Guha, Veera Baladandayuthapani, Bani K. Mallick

JMLR 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our methods are applied to a novel cancer proteomics data dataset where-in multiple proteomic antibodies are simultaneously assessed on tumor samples using reverse-phase protein arrays (RPPA) technology. ... Key-words: Graphical model, Quantile regression, Variational Bayes. ... In this section, we consider three simulation settings to illustrate the application of the proposed methodology.
Researcher Affiliation	Academia	Nilabja Guha (EMAIL), Department of Mathematical Sciences, University of Massachusetts Lowell, Lowell, MA 01854, USA; Veera Baladandayuthapani (EMAIL), Department of Biostatistics, University of Michigan, Ann Arbor, MI 48103, USA; Bani K. Mallick (EMAIL), Department of Statistics, Texas A & M University, College Station, TX 77843, USA
Pseudocode	Yes	Table 1: Variational Update Algorithm. 1. Set the initial values q^0(β_l), q^0(v_l), q^0(I_l), q^0(t_l) and q^0(π_l). We denote the current density by q^old(). For iteration in 1:N: 2. Find q^new(β_l) by q^new(β_l) = arg min_{q(β_l)} L( q(β_l) q^old(I_l) q^old(v_l) q^old(π_l) q^old(t_l), p(Y, Θ_l) ). Initialize q^old(β_l) = q^new(β_l). 3. Find q^new(I_l) by q^new(I_l) = arg min_{q(I_l)} L( q^old(β_l) q(I_l) q^old(v_l) q^old(π_l) q^old(t_l), p(Y, Θ_l) ). Initialize q^old(I_l) = q^new(I_l). 4. Find q^new(v_l) by q^new(v_l) = arg min_{q(v_l)} L( q^old(β_l) q^old(I_l) q(v_l) q^old(π_l) q^old(t_l), p(Y, Θ_l) ). We initialize q^old(v_l) = q^new(v_l). 5. Find q^new(π_l) by q^new(π_l) = arg min_{q(π_l)} L( q^old(β_l) q^old(I_l) q^old(v_l) q(π_l) q^old(t_l), p(Y, Θ_l) ). Initialize q^old(π_l) = q^new(π_l). 6. Find q^new(t_l) by q^new(t_l) = arg min_{q(t_l)} L( q^old(β_l) q^old(I_l) q^old(v_l) q^old(π_l) q(t_l), p(Y, Θ_l) ). Initialize q^old(t_l) = q^new(t_l). We continue until the stop criterion is met. end for 9. Return the approximation q^old(β_l) q^old(I_l) q^old(v_l) q^old(π_l) q^old(t_l).
Open Source Code	No	The paper does not provide a direct statement of code release, a link to a code repository, or explicitly state that code is available in supplementary materials for the methodology described.
Open Datasets	Yes	Our methods are applied to a novel cancer proteomics data dataset where-in multiple proteomic antibodies are simultaneously assessed on tumor samples using reverse-phase protein arrays (RPPA) technology. ... The Cancer Genome Atlas (TCGA) is a source of molecular profiles for many different tumor types. Functional protein analysis by reverse-phase protein arrays (RPPA) is included in TCGA and looking at the proteomic characterization the signaling network can be established. ... A comprehensive analysis of similar network can be found in Akbani et al. (2014) for various cancers, where the EGFR family along with MAPK and MEK lineage was found to be dominant determinant of signaling, where for LUSC it was mainly EGFR.
Dataset Splits	No	The paper mentions generating observations for simulations (e.g., "n = 400 independent observations") and the size of the LUSC dataset ("n = 121 observations"), but it does not specify explicit training/test/validation splits or cross-validation strategies for these datasets.
Hardware Specification	No	The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, or cloud resources) used to run the experiments.
Software Dependencies	No	We compare our methodology with the neighborhood selection method for Gaussian graphical model (GGM), using the R package huge (Zhao et al. (2012)) where the model is selected by huge.select function. We considered graphical Lasso (GLASSO) for graph estimation. ... The GGM detects a lot of extra edges along with the existing edges. Also, G1 and G2 are generally well separated by the quantile based method. Overall, the quantile based variational Bayes provides a sparser and a more accurate solution. A typical MCMC fit is similar to QVB fit (Figure 2) but MCMC fits generally have slightly sparser graph with QVB detecting weaker connections more frequently.
Experiment Setup	Yes	For quantile based variational Bayes (QVB), the data is standardized, and we use t = 1, independent N(0, v), v = 1 prior on the coefficients. The QVB graph is robust to prior variance over a range 1 ≤ v ≤ 100. ... We use a1 = 1, b1 = 1 in the Beta-Binomial prior. ... We use 40 iterations for QVB but in all the examples considered, the convergence happens within 20 iterations. Using 5000 samples for each node, and 5000 burn ins, the MCMC runtime is nearly 100 times or more of that of the QVB. ... We use our quantile based variational approach with {τ_l} = {.1, .2, .3, ..., .7, .8, .9} and a normal prior on β (independent N(0, 1)).
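
Illustrative sketch (not from the paper): because no code release is reported, the settings quoted in the Experiment Setup row can be hard to picture. The Python snippet below is a minimal sketch that records those settings, builds the quantile grid {.1, ..., .9}, and standardizes a data matrix. The names QVB_SETTINGS, TAUS, standardize and check_loss are hypothetical, and the check (pinball) loss shown is the standard quantile-regression loss, included as an assumption about what a quantile-based fit would minimize; it is not the authors' variational update.

```python
import numpy as np

# Settings quoted in the Experiment Setup row. The dictionary and its
# key names are hypothetical; only the values come from the quoted text.
QVB_SETTINGS = {
    "t": 1.0,                  # "we use t = 1"
    "prior_variance": 1.0,     # independent N(0, v) prior with v = 1
    "beta_binomial_a1": 1.0,   # a1 = 1 in the Beta-Binomial prior
    "beta_binomial_b1": 1.0,   # b1 = 1 in the Beta-Binomial prior
    "max_iterations": 40,      # "We use 40 iterations for QVB"
}

# Quantile grid {0.1, 0.2, ..., 0.9} as quoted.
TAUS = np.round(np.arange(1, 10) / 10, 1)


def standardize(X):
    """Center each column and scale it to unit sample standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def check_loss(u, tau):
    """Standard check (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0]).

    Assumption: this is the textbook quantile-regression loss, shown only
    to make the role of the quantile level tau concrete.
    """
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))
```

For example, standardize(X) could be applied to an n x p data matrix before fitting, and check_loss(residuals, tau) evaluated for each tau in TAUS. Under-predictions and over-predictions are weighted by tau and 1 - tau respectively, which is what lets each quantile level be targeted separately.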