Bayesian Covariate-Dependent Gaussian Graphical Models with Varying Structure
Authors: Yang Ni, Francesco C. Stingo, Veerabhadran Baladandayuthapani
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive simulations and a case study in cancer genomics demonstrate the utility of the proposed model. Keywords: Covariate-dependent graphs; Markov random fields; Random thresholding; Subject-level inference; Undirected graphs |
| Researcher Affiliation | Academia | Yang Ni, Department of Statistics, Texas A&M University, College Station, TX 77843, USA; Francesco C. Stingo, Department of Statistics, Computer Science, Applications "G. Parenti", University of Florence, Florence, Italy; Veerabhadran Baladandayuthapani, Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA |
| Pseudocode | Yes | The MCMC Algorithm. Initialize model parameters; repeat the following steps until practical convergence. (I) Update precision matrices Ω_i: scanning through each column k = 1, …, p, each row j ≠ k, and each covariate ℓ = 1, …, q+1, propose β′_{jkℓ}, t′_{jk}, and β′_{kkℓ} from q_β(β′_{jkℓ} | β_{jkℓ}), q_t(log t′_{jk} | log t_{jk}), and q_β(β′_{kkℓ} | β_{kkℓ}) I(β′_{kkℓ} ∈ S_{kℓ}), where q_t(log t′_{jk} | log t_{jk}) = N(log t′_{jk} | log t_{jk}, η_t²) and q_β(β′_{jkℓ} | β_{jkℓ}) = N(β′_{jkℓ} | β_{jkℓ}, η_β²). The proposal is accepted with probability min(1, α). |
| Open Source Code | No | The paper does not provide concrete access to its own source code for the methodology described. It mentions using "R package huge" for a baseline method but not for its own implementation. |
| Open Datasets | Yes | We use data generated by the Multiple Myeloma Research Consortium, a multi-institutional collaborative research effort that collected data (among others) on gene expression and clinical parameters from MM patients (Chapman et al., 2011). |
| Dataset Splits | No | For the real data application, the paper reports a total sample size of n = 151 but does not specify training, validation, or test splits. For the simulations, it mentions generating a separate dataset to test graph interpolation, but this is not a conventional train/test split of a single empirical dataset. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions the "R package huge" for graphical lasso, but it does not provide a specific version number for this or any other software dependency for its own implementation. |
| Experiment Setup | Yes | For GGMx, we set the hyperparameters a_τ = b_τ = 10⁻¹, µ_t = 1, and σ_t = 0.2; these choices are tested in sensitivity analyses at the end of that section. Both GGMx and BGGM were run for 10,000 iterations with 5,000 burn-in. We ran two separate MCMCs, each with 50,000 iterations, discarded the first 50% as burn-in, and saved every 50th sample after burn-in. The probability cutoff c was chosen to control the posterior expected FDR at 1%. |
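The "random thresholding" keyword in the Research Type row describes how subject-level precision matrices vary with covariates: each entry is a linear function of the subject's covariates, and off-diagonal entries whose magnitude falls below a learned threshold t_{jk} are zeroed, inducing a subject-specific graph. A minimal sketch of that idea, with the functional forms (hard thresholding, shared thresholds across subjects) assumed for illustration rather than taken from the paper:

```python
import numpy as np

def subject_precision(x_i, beta, t):
    """Covariate-dependent precision matrix for one subject (sketch).

    x_i  : (q+1,) covariate vector, including an intercept
    beta : (p, p, q+1) coefficients; beta[j, k] drives entry (j, k)
    t    : (p, p) positive thresholds; off-diagonal entries with
           |value| <= t[j, k] are zeroed (hard thresholding assumed)
    """
    p = beta.shape[0]
    omega = beta @ x_i                       # (p, p) linear predictors
    off = ~np.eye(p, dtype=bool)
    omega[off & (np.abs(omega) <= t)] = 0.0  # threshold off-diagonals only
    return (omega + omega.T) / 2.0           # symmetrize
```

Because the zero pattern depends on `x_i`, two subjects with different covariates can have different graphs under the same parameters, which is the subject-level inference the keywords advertise.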
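The MCMC step quoted in the Pseudocode row is a random-walk Metropolis–Hastings update: Gaussian proposals on the coefficients and on the log of the threshold, accepted with probability min(1, α). A minimal sketch of that proposal/accept pattern for a single (β_{jk}, t_{jk}) pair, with `log_target` standing in as a placeholder for the paper's full-conditional density (not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)

def mh_update(beta_jk, log_t_jk, log_target, eta_beta=0.1, eta_t=0.1):
    """One random-walk MH update of (beta_jk, log t_jk) (sketch).

    The Gaussian proposals are symmetric, so the acceptance ratio
    reduces to the target-density ratio; t_jk is updated on the log
    scale, mirroring q_t(log t'_jk | log t_jk) = N(log t_jk, eta_t^2).
    """
    beta_prop = rng.normal(beta_jk, eta_beta)
    log_t_prop = rng.normal(log_t_jk, eta_t)
    log_alpha = log_target(beta_prop, log_t_prop) - log_target(beta_jk, log_t_jk)
    if np.log(rng.uniform()) < min(0.0, log_alpha):
        return beta_prop, log_t_prop, True   # accepted
    return beta_jk, log_t_jk, False          # rejected, keep current state
```

In the paper's sweep this update is repeated over every column k, row j, and covariate ℓ, with an extra indicator constraint on the diagonal coefficients β_{kkℓ}.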
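The "posterior expected FDR at 1%" in the Experiment Setup row refers to the standard Bayesian FDR rule: rank edges by posterior inclusion probability and pick the smallest cutoff c such that the average of (1 − p) over the selected edges stays below the target. A sketch of that standard rule, not code from the paper:

```python
import numpy as np

def fdr_cutoff(probs, target=0.01):
    """Smallest cutoff c whose selected set {p >= c} keeps the
    posterior expected FDR, mean(1 - p) over selections, <= target."""
    p = np.sort(np.asarray(probs, dtype=float))[::-1]   # descending
    expected_fdr = np.cumsum(1.0 - p) / np.arange(1, p.size + 1)
    keep = np.nonzero(expected_fdr <= target)[0]
    if keep.size == 0:
        return 1.0                   # no cutoff meets the target
    return p[keep[-1]]               # last probability still under target
```

For example, probabilities [0.999, 0.995, 0.99, 0.5] give running expected FDRs of roughly [0.001, 0.003, 0.005, 0.129], so the cutoff is 0.99 and the first three edges are reported.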