reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Estimation of Sparse Gaussian Graphical Models with Hidden Clustering Structure

Authors: Meixia Lin, Defeng Sun, Kim-Chuan Toh, Chengjing Wang

JMLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Numerical experiments on both synthetic data and real data demonstrate the good performance of our model, as well as the efficiency and robustness of our proposed algorithm.
Researcher Affiliation	Academia	Meixia Lin EMAIL Engineering Systems and Design Singapore University of Technology and Design... Defeng Sun EMAIL Department of Applied Mathematics The Hong Kong Polytechnic University... Kim-Chuan Toh EMAIL Department of Mathematics and Institute of Operations Research and Analytics National University of Singapore... Chengjing Wang EMAIL School of Mathematics Southwest Jiaotong University...
Pseudocode	Yes	Algorithm 1: sGS-ADMM... Algorithm 2: pALM... Algorithm 3: SSN
Open Source Code	No	The paper does not provide concrete access to source code for the methodology described. It only references a third-party tool's link (https://CRAN.R-project.org/package=spectralGraphTopology) for visualization.
Open Datasets	Yes	The RNA-Seq Cancer Genome Atlas Research Network (Weinstein et al., 2013; Kumar et al., 2020)... We use the Animals data set (Kemp and Tenenbaum, 2008; Egilmez et al., 2017; Kumar et al., 2020)... We consider the Zoo data set from the UCI Machine Learning Repository...
Dataset Splits	No	The paper describes generating `p` independent samples for synthetic data and using full real datasets for estimation, but does not specify training/test/validation splits as the methodology involves graphical model estimation on the entire dataset rather than typical supervised learning splits.
Hardware Specification	Yes	All experiments are implemented in Matlab R2022b on a windows workstation (16-core, Intel Xeon Gold 6244 @ 3.60GHz, 128 G RAM).
Software Dependencies	Yes	All experiments are implemented in Matlab R2022b on a windows workstation (16-core, Intel Xeon Gold 6244 @ 3.60GHz, 128 G RAM).
Experiment Setup	Yes	In each method, we always have one tuning parameter ρ, which will be selected by grid search. Specifically, for our estimator, we select ρ in the range of 5 × 10^-5 to 5 × 10^-3 with 20 equally divided grid points... we fix the iteration number of the sGS-ADMM in Phase I to be 200... we stop the algorithm when max{R_P, R_D, R_C} < Tol, with Tol = 10^-6 as the default.
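
The grid search and stopping rule quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration and not the authors' Matlab code. It assumes the grid for ρ runs from 5 × 10^-5 to 5 × 10^-3, and it reads "equally divided" as linear spacing, with log spacing offered as an alternative. The names `rho_grid` and `has_converged` are hypothetical.

```python
import numpy as np

# Assumed values taken from the quoted setup; the spacing is an interpretation.
RHO_MIN, RHO_MAX, N_GRID = 5e-5, 5e-3, 20
TOL = 1e-6  # default tolerance for the stopping rule


def rho_grid(log_spaced=False):
    """Return 20 candidate values of rho between RHO_MIN and RHO_MAX.

    Linear spacing is the literal reading of "equally divided";
    log spacing is a common alternative for penalty parameters.
    """
    if log_spaced:
        return np.geomspace(RHO_MIN, RHO_MAX, N_GRID)
    return np.linspace(RHO_MIN, RHO_MAX, N_GRID)


def has_converged(r_p, r_d, r_c, tol=TOL):
    """Stopping rule: the largest of the three residuals is below tol."""
    return max(r_p, r_d, r_c) < tol
```

A solver loop would call `has_converged` once per iteration with its current residuals R_P, R_D and R_C, and the outer grid search would run the solver once for each value in `rho_grid()`.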