Tree-based Node Aggregation in Sparse Graphical Models

Authors: Ines Wilms, Jacob Bien

JMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We investigate the advantages of jointly exploiting node aggregation and edge sparsity in graphical models. Section 4 presents the results of a simulation study. Section 5 illustrates the practical advantages of the tag-lasso on financial and microbiome data sets. We evaluate the estimators in terms of three performance metrics: estimation accuracy, aggregation performance, and sparsity recovery."
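The quoted excerpt names sparsity recovery as one of the three performance metrics but does not define it. A common way to score it is true/false positive rates on the off-diagonal support of the estimated precision matrix; the sketch below is an illustrative assumption, not the paper's exact metric.

```python
import numpy as np

def sparsity_recovery(theta_hat, theta_true, tol=1e-8):
    """TPR/FPR on off-diagonal support: a standard sparsity-recovery
    score, assumed here since the paper's definition is not quoted."""
    off = ~np.eye(theta_true.shape[0], dtype=bool)   # ignore the diagonal
    est = np.abs(theta_hat[off]) > tol               # estimated edges
    tru = np.abs(theta_true[off]) > tol              # true edges
    tpr = est[tru].mean() if tru.any() else 1.0
    fpr = est[~tru].mean() if (~tru).any() else 0.0
    return tpr, fpr

# Toy 3x3 example: one true edge (1,2); estimate adds a spurious edge (1,3).
theta_true = np.array([[1.0, 0.5, 0.0],
                       [0.5, 1.0, 0.0],
                       [0.0, 0.0, 1.0]])
theta_hat = np.array([[1.0, 0.4, 0.1],
                      [0.4, 1.0, 0.0],
                      [0.1, 0.0, 1.0]])
tpr, fpr = sparsity_recovery(theta_hat, theta_true)
```

Here the true edge is recovered (TPR = 1.0) while two of the four true zeros are falsely flagged (FPR = 0.5).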
Researcher Affiliation | Academia | Ines Wilms (EMAIL), Department of Quantitative Economics, Maastricht University, Maastricht, The Netherlands; Jacob Bien (EMAIL), Department of Data Sciences and Operations, Marshall School of Business, University of Southern California, California, USA
Pseudocode | Yes | Algorithm 1: Compute partition matrix from tag-lasso solution; Algorithm 2: LA-ADMM; Algorithm 3: ADMM
Open Source Code | Yes | "An R package called taglasso implements the proposed method and is available on the GitHub page (https://github.com/ineswilms/taglasso) of the first author."
Open Datasets | Yes | "We demonstrate our method on a financial data set containing daily realized variances of p = 31 stock market indices from across the world in 2019 (n = 254). Daily realized variances based on five-minute returns are taken from the Oxford-Man Institute of Quantitative Finance (publicly available at http://realized.oxford-man.ox.ac.uk/data/download). We next turn to a data set of gut microbial amplicon data in HIV patients (Rivera-Pinto et al., 2018)."
Dataset Splits | Yes | "To select the tuning parameters λ1 and λ2, we form a 10 × 10 grid of (λ1, λ2) values and find the pair that minimizes a 5-fold cross-validated likelihood-based score. We take a random sample of n = 203 observations (80% of the full data set) to form a training sample covariance matrix and use the remaining data to form a test sample covariance matrix Stest, and repeat this procedure ten times."
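The quoted splitting procedure (ten random 80/20 splits of the n = 254, p = 31 financial data set into training and test sample covariance matrices) can be sketched as follows; the data matrix here is a random placeholder, since the sketch only illustrates the split mechanics, not the actual realized-variance data.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 254, 31                          # sizes quoted for the financial data set
X = rng.standard_normal((n, p))         # placeholder for the realized-variance data

for rep in range(10):                   # the paper repeats the split ten times
    idx = rng.permutation(n)
    n_train = int(0.8 * n)              # 203 of the 254 observations
    train, test = idx[:n_train], idx[n_train:]
    S_train = np.cov(X[train], rowvar=False)   # training sample covariance
    S_test = np.cov(X[test], rowvar=False)     # "Stest" in the paper
    # S_train feeds the estimator; S_test scores the fitted precision matrix
```

Each repetition yields a fresh (S_train, S_test) pair, so performance is averaged over ten random splits rather than tied to one particular partition.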
Hardware Specification | No | The paper does not provide specific hardware details. It only states that simulations were performed using an R package.
Software Dependencies | No | "All simulations were performed using the simulator package (Bien, 2016) in R (R Core Team, 2017)." While R and a package are mentioned, no specific version numbers for these software components are provided in the text.
Experiment Setup | Yes | "To select the tuning parameters λ1 and λ2, we form a 10 × 10 grid of (λ1, λ2) values and find the pair that minimizes a 5-fold cross-validated likelihood-based score. We use the LA-ADMM algorithm with ρ1 = 0.01, Tstages = 10, maxit = 100."
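To make the quoted LA-ADMM settings (ρ1 = 0.01, Tstages = 10, maxit = 100) concrete, here is a minimal multi-stage ADMM driver applied to a toy lasso problem rather than the tag-lasso itself. The stage-wise doubling of ρ with warm starts is an assumption about LA-ADMM's structure for illustration only; the paper's Algorithms 2 and 3 define the actual updates.

```python
import numpy as np

def soft(v, k):
    """Elementwise soft-thresholding operator."""
    return np.sign(v) * np.maximum(np.abs(v) - k, 0.0)

def staged_admm_lasso(A, b, lam, rho1=0.01, t_stages=10, maxit=100):
    """Multi-stage ADMM for (1/2)||Ax - b||^2 + lam*||x||_1, mimicking
    the quoted settings. The rho-doubling schedule across stages is an
    assumed stand-in for LA-ADMM, not the paper's exact update rules."""
    n, p = A.shape
    x = z = u = np.zeros(p)
    rho = rho1
    AtA, Atb = A.T @ A, A.T @ b
    for _ in range(t_stages):
        L = np.linalg.cholesky(AtA + rho * np.eye(p))  # refactor once per stage
        for _ in range(maxit):
            x = np.linalg.solve(L.T, np.linalg.solve(L, Atb + rho * (z - u)))
            z = soft(x + u, lam / rho)
            u = u + x - z
        rho *= 2.0   # assumed stage-wise increase, warm-starting x, z, u
    return z

# Toy problem: 3 active coefficients out of 10.
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 10))
x_true = np.zeros(10)
x_true[:3] = [2.0, -1.5, 1.0]
b = A @ x_true + 0.01 * rng.standard_normal(50)
x_hat = staged_admm_lasso(A, b, lam=0.5)
```

Starting from a small ρ and increasing it across stages lets early stages make large moves while later stages enforce the consensus constraint tightly, which is the usual motivation for staged or adaptive ADMM schemes.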