Learning Local Dependence In Ordered Data
Authors: Guo Yu, Jacob Bien
JMLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results show our method performing favorably compared to existing methods. We apply our method to genomic data to flexibly model linkage disequilibrium. Our method is also applied to improve the performance of discriminant analysis in sound recording classification. ... Keywords: Local dependence, Gaussian graphical models, precision matrices, Cholesky factor, hierarchical group lasso |
| Researcher Affiliation | Academia | Guo Yu EMAIL Department of Statistical Science Cornell University, 1173 Comstock Hall Ithaca, NY 14853, USA Jacob Bien EMAIL Department of Biological Statistics and Computational Biology and Department of Statistical Science Cornell University, 1178 Comstock Hall Ithaca, NY 14853, USA |
| Pseudocode | Yes | Algorithm 1 ADMM algorithm to solve (8) ... Algorithm 2 Algorithm for solving (10) for unweighted estimator ... Algorithm 3 BCD on the dual problem (28) |
| Open Source Code | Yes | The R package varband provides C++ implementations of Algorithms 1 and 2. ... An R (R Core Team, 2016) package, named varband, is available on CRAN, implementing our estimator. |
| Open Datasets | Yes | We study HapMap phase 3 data from the International HapMap project (Consortium et al., 2010). ... a classification problem described in Hastie et al. (2009). |
| Dataset Splits | Yes | To gauge the performance of our estimator on modeling LD, we randomly split the 167 samples into training and testing sets of sizes 84 and 83, respectively. ... we randomly split the data into two parts, with 10% of the data assigned to the training set and the remaining 90% of the data assigned to the test set. On the training set, we use 5-fold cross-validation to select the tuning parameter minimizing misclassification error on the validation data. |
| Hardware Specification | No | The paper does not contain specific hardware details like CPU/GPU models, memory amounts, or cloud instance types used for the experiments. It focuses on the algorithm, its theoretical properties, and empirical performance without detailing the execution environment hardware. |
| Software Dependencies | No | An R (R Core Team, 2016) package, named varband, is available on CRAN, implementing our estimator. The estimation is very fast with core functions coded in C++, allowing us to solve large-scale problems in substantially less time than is possible with the R-based implementation of the nested lasso. This mentions "R" and "C++" as implementation languages and cites an R Core Team publication from 2016, but it does not provide specific version numbers for these, the `varband` package, or any other ancillary software libraries or compilers. |
| Experiment Setup | Yes | The tuning parameter λ ≥ 0 in (5) measures the amount of regularization and determines the sparsity level of the estimator. We use 100 tuning parameter values for each estimator and repeat the simulation 10 times. ... Tuning parameters are chosen using the one-standard-error rule (see, e.g., Hastie et al., 2009). ... On the training set, we use 5-fold cross-validation to select the tuning parameter minimizing misclassification error on the validation data. |
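The dataset-split and experiment-setup rows above describe a common validation protocol: a random train/test split, 5-fold cross-validation over a grid of tuning parameters, and tuning-parameter selection via the one-standard-error rule. The sketch below illustrates that protocol in generic Python; it is not taken from the paper's `varband` package, and all function names here are hypothetical.

```python
import random
import statistics

def five_fold_indices(n, k=5, seed=0):
    """Partition sample indices 0..n-1 into k roughly equal CV folds.
    Mirrors the 5-fold cross-validation described in the table; the
    seed and partitioning scheme are illustrative assumptions."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def one_standard_error_rule(lambdas, cv_means, cv_ses):
    """Select the largest lambda (most regularization) whose mean CV
    error is within one standard error of the minimizing lambda's error,
    as in the one-standard-error rule of Hastie et al. (2009)."""
    best = min(range(len(lambdas)), key=lambda i: cv_means[i])
    threshold = cv_means[best] + cv_ses[best]
    eligible = [lam for lam, m in zip(lambdas, cv_means) if m <= threshold]
    return max(eligible)

# Example: the LD experiment splits 167 samples into folds; here we just
# check the fold partition covers every sample exactly once.
folds = five_fold_indices(167)
assert sorted(i for fold in folds for i in fold) == list(range(167))

# Toy CV curve over three lambda values: the minimum is at lambda=0.1,
# but 0.2 is within one standard error, so the rule prefers it.
lam = one_standard_error_rule([0.1, 0.2, 0.3],
                              [1.00, 1.05, 2.00],
                              [0.10, 0.10, 0.10])
print(lam)  # 0.2
```

The one-standard-error rule trades a small increase in CV error for a sparser (more regularized) model, which matches the table's note that tuning parameters in the simulations were chosen this way.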