reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

High-Dimensional Inference for Cluster-Based Graphical Models

Authors: Carson Eisenach, Florentina Bunea, Yang Ning, Claudiu Dinicu

JMLR 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	As an illustration of the usage of these newly developed inferential tools, we show that they can be reliably used for recovery of the sparsity pattern of the graphs we study, under FDR control, which is verified via simulation studies and an fMRI data analysis. These experimental results confirm the theoretically established difference between the two graph structures. Furthermore, the data analysis suggests that the latent variable graph, corresponding to the unobserved cluster centers, can help provide more insight into the understanding of the brain connectivity networks relative to the simpler, average-based, graph. This section contains simulations and a real data analysis that illustrate the finite sample performance of the inferential procedures developed in the previous sections for the latent variable graph and cluster-average graph, respectively.
Researcher Affiliation	Academia	Carson Eisenach EMAIL Department of Operations Research and Financial Engineering Princeton University Princeton, NJ 08544, USA Florentina Bunea EMAIL Yang Ning EMAIL Claudiu Dinicu EMAIL Department of Statistics and Data Science Cornell University Ithaca, NY 14850, USA
Pseudocode	Yes	For completeness, we outline the PECOK algorithm below, which consists in a convex relaxation of the K-means algorithm, further tailored to estimation of clusters $G = \{G_1, \ldots, G_K\}$ defined via the interpretable model (1). The PECOK algorithm consists in the following three steps: 1. Compute an estimator $\tilde\Gamma$ of the matrix $\Gamma$. 2. Solve the semi-definite program (SDP) $\hat B = \arg\max_{B \in \mathcal{D}} \langle \hat\Sigma - \tilde\Gamma, B \rangle$ (4), where $\hat\Sigma$ is the sample covariance matrix and $\mathcal{D} = \{B : B \succeq 0 \text{ (symmetric and positive semidefinite)},\ \sum_a B_{ab} = 1 \ \forall b,\ B_{ab} \ge 0 \ \forall a, b,\ \operatorname{tr}(B) = K\}$. 3. Compute $\hat G$ by applying a clustering algorithm on the rows (or equivalently columns) of $\hat B$.
Open Source Code	No	The paper does not explicitly provide a link to source code developed by the authors for the methods described in this paper, nor does it contain an explicit statement of code release for this work. It mentions the FORCE algorithm and cites Eisenach and Liu (2019) for it, but this is a tool used, not the code for the current paper's methodology.
Open Datasets	Yes	As an illustration, we focus on the publicly available resting-state fMRI data from the Neuro-bureau pre-processed repository (Bellec et al., 2015). Specifically, we use the data from patient 1018959, session 1 in the KKI dataset.
Dataset Splits	No	The paper describes generating synthetic datasets for simulations but does not specify training/test/validation splits (e.g., percentages, counts, or references to standard splits). For the fMRI dataset, it mentions using data from 148 time periods but does not detail how this data was split for experimental evaluation purposes.
Hardware Specification	No	The paper does not specify any particular hardware (e.g., GPU/CPU models, processor types, memory amounts, or cloud/cluster specifications) used for running the experiments or simulations.
Software Dependencies	No	The paper mentions applying the 'FORCE algorithm (Eisenach and Liu, 2019)' but does not provide specific version numbers for this or any other software, libraries, or solvers that would be necessary to replicate the experiments.
Experiment Setup	Yes	The regularization parameters λ and λ are chosen by 5-fold cross validation.
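
The Experiment Setup row says only that the regularization parameters are chosen by 5-fold cross validation. As a rough illustration of what such a selection loop can look like, the sketch below splits the sample into five folds and picks the grid value with the smallest average held-out loss. Everything named here is an assumption made for illustration: `k_fold_indices`, `select_lambda`, `fit_shrunk_mean`, and `squared_error` are hypothetical helpers, the candidate grid is invented, and the shrunken-mean estimator is a toy stand-in, not the estimator or loss used in the paper.

```python
import random
from statistics import mean


def k_fold_indices(n, k=5, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def select_lambda(data, grid, fit, loss, k=5, seed=0):
    """Return (best, scores): the grid value with the smallest mean held-out loss."""
    folds = k_fold_indices(len(data), k, seed)
    scores = {}
    for lam in grid:
        fold_losses = []
        for held_out in folds:
            held = set(held_out)
            train = [x for i, x in enumerate(data) if i not in held]
            test = [data[i] for i in held_out]
            model = fit(train, lam)          # fit on the other k-1 folds
            fold_losses.append(loss(model, test))  # score on the held-out fold
        scores[lam] = mean(fold_losses)
    best = min(grid, key=lambda lam: scores[lam])
    return best, scores


# Toy stand-in estimator (an assumption, NOT the paper's method):
# the sample mean shrunk toward zero by a factor 1 / (1 + lam).
def fit_shrunk_mean(train, lam):
    return mean(train) / (1.0 + lam)


def squared_error(model, test):
    return mean((x - model) ** 2 for x in test)


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.gauss(2.0, 1.0) for _ in range(100)]
    grid = [0.0, 0.1, 1.0, 10.0]  # hypothetical candidate values
    best, scores = select_lambda(sample, grid, fit_shrunk_mean, squared_error)
    print("selected lambda:", best)
```

When two regularization parameters are tuned jointly, the same loop applies with a grid of pairs, provided `fit` accepts a pair in place of a single value.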