Tree-based Node Aggregation in Sparse Graphical Models
Authors: Ines Wilms, Jacob Bien
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4 presents the results of a simulation study. Section 5 illustrates the practical advantages of the tag-lasso on financial and microbiome data sets. We investigate the advantages of jointly exploiting node aggregation and edge sparsity in graphical models. We evaluate the estimators in terms of three performance metrics: estimation accuracy, aggregation performance, and sparsity recovery. |
| Researcher Affiliation | Academia | Ines Wilms EMAIL Department of Quantitative Economics Maastricht University Maastricht, The Netherlands Jacob Bien EMAIL Department of Data Sciences and Operations Marshall School of Business, University of Southern California California, USA |
| Pseudocode | Yes | Algorithm 1 Compute partition matrix from tag-lasso solution Algorithm 2 LA-ADMM Algorithm 3 ADMM |
| Open Source Code | Yes | An R package called taglasso implements the proposed method and is available on the GitHub page (https://github.com/ineswilms/taglasso) of the first author. |
| Open Datasets | Yes | We demonstrate our method on a financial data set containing daily realized variances of p = 31 stock market indices from across the world in 2019 (n = 254). Daily realized variances based on five minute returns are taken from the Oxford-Man Institute of Quantitative Finance (publicly available at http://realized.oxford-man.ox.ac.uk/data/download). We next turn to a data set of gut microbial amplicon data in HIV patients (Rivera-Pinto et al., 2018) |
| Dataset Splits | Yes | To select the tuning parameters λ1 and λ2, we form a 10 × 10 grid of (λ1, λ2) values and find the pair that minimizes a 5-fold cross-validated likelihood-based score. We take a random sample of n = 203 observations (80% of the full data set) to form a training sample covariance matrix and use the remaining data to form a test sample covariance matrix S^test, and repeat this procedure ten times. |
| Hardware Specification | No | The paper does not provide specific hardware details. It only states that simulations were performed using an R package. |
| Software Dependencies | No | All simulations were performed using the simulator package (Bien, 2016) in R (R Core Team, 2017). While R and a package are mentioned, no specific version numbers for these software components are provided in the text. |
| Experiment Setup | Yes | To select the tuning parameters λ1 and λ2, we form a 10 × 10 grid of (λ1, λ2) values and find the pair that minimizes a 5-fold cross-validated likelihood-based score. We use the LA-ADMM algorithm with ρ1 = 0.01, Tstages = 10, maxit = 100. |
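The tuning procedure quoted above (a 10 × 10 grid of (λ1, λ2) pairs scored by 5-fold cross-validated Gaussian likelihood) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the actual estimator is the tag-lasso fit via LA-ADMM in the R package taglasso, whereas `fit_precision` below is a hypothetical ridge-shrinkage placeholder, and the toy data, grid range, and score function are assumptions for demonstration only.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))  # toy data: n = 100 observations, p = 5 variables

def fit_precision(S, lam):
    # Placeholder estimator: ridge-regularized inverse covariance.
    # The paper instead fits the tag-lasso with LA-ADMM; this stand-in
    # only serves to make the cross-validation loop concrete.
    return np.linalg.inv(S + lam * np.eye(S.shape[0]))

def neg_loglik(Theta, S_test):
    # Likelihood-based score: -log det(Theta) + tr(S_test @ Theta)
    _, logdet = np.linalg.slogdet(Theta)
    return -logdet + np.trace(S_test @ Theta)

def cv_select(X, lam1_grid, lam2_grid, k=5):
    # Pick the (lam1, lam2) pair minimizing the k-fold CV score.
    n = X.shape[0]
    folds = np.array_split(rng.permutation(n), k)
    best_pair, best_score = None, np.inf
    for lam1, lam2 in product(lam1_grid, lam2_grid):
        score = 0.0
        for fold in folds:
            train = np.setdiff1d(np.arange(n), fold)
            S_train = np.cov(X[train], rowvar=False)
            S_test = np.cov(X[fold], rowvar=False)
            # The placeholder collapses both penalties into one parameter;
            # the tag-lasso keeps them separate (aggregation vs. sparsity).
            Theta = fit_precision(S_train, lam1 + lam2)
            score += neg_loglik(Theta, S_test)
        if score < best_score:
            best_pair, best_score = (lam1, lam2), score
    return best_pair

grid = np.logspace(-2, 0, 10)  # assumed 10-point grid; range not given in the paper
best_pair = cv_select(X, grid, grid)
print(best_pair)
```

The same loop structure applies regardless of the estimator plugged into `fit_precision`; only the inner fit and the separate roles of λ1 (aggregation) and λ2 (sparsity) would change when using the real tag-lasso.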