A Cluster Elastic Net for Multivariate Regression
Authors: Bradley S. Price, Ben Sherwood
JMLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Simulations and data examples from business operations and genomics are presented to show the merits of both the least squares and binomial methods. |
| Researcher Affiliation | Academia | Bradley S. Price EMAIL College of Business and Economics West Virginia University Morgantown, WV 26505, USA; Ben Sherwood EMAIL School of Business University of Kansas Lawrence, KS 66045, USA |
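| Pseudocode | Yes | We propose a two-step iterative procedure to obtain a local minimum. 1. Begin with initial estimates $\hat\beta_1^{1}, \ldots, \hat\beta_r^{1}$. 2. For the $w$th step, where $w > 1$, repeat the steps below until the group estimates do not change: (a) Hold $\hat B^{w-1}$ fixed and choose $\hat D_1^{w}, \ldots, \hat D_Q^{w}$ to minimize, over $D_1, \ldots, D_Q$, the within-group differences $\lVert X\hat\beta_l^{w-1} - X\hat\beta_m^{w-1} \rVert_2^2$. This can be solved by performing K-means clustering on the $r$ $n$-dimensional vectors $X\hat\beta_1^{w-1}, \ldots, X\hat\beta_r^{w-1}$. (b) Holding $\hat D_1^{w}, \ldots, \hat D_Q^{w}$ fixed, the $w$th estimate of $B$ is $\hat B^{w} = \arg\min_{B \in \mathbb{R}^{p \times r}}$ of $\frac{1}{2n} \sum_{i=1}^{n} \sum_{c=1}^{r} (y_{ic} - x_i^{T}\beta_c)^2 + \delta \lVert B \rVert_1$ plus a penalty on $\lVert X(\beta_l - \beta_m) \rVert_2^2$ for responses $l$ and $m$ in the same group. |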
| Open Source Code | Yes | The mcen R package that implements the methods outlined in this article is available on CRAN (Sherwood and Price, 2018). |
| Open Datasets | Yes | Votavova et al. (2011) collected gene expression profiles, demographic and birth information from 72 pregnant mothers. |
| Dataset Splits | Yes | To evaluate the methods we randomly partitioned the data into 50 training and 15 testing samples. We divide 2000 transactions into training and validation sets. The first 1000 transactions are used to train our models, with 3-fold cross validation used to select the tuning parameters for both MCEN and SEN. The predictive performance of the models is then compared using the next 1000 transactions. |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments, such as GPU/CPU models or memory amounts. |
| Software Dependencies | Yes | The mcen R package that implements the methods outlined in this article is available on CRAN (Sherwood and Price, 2018). |
| Experiment Setup | Yes | Tuning parameters for all methods are selected using 10-fold cross validation. For the MCEN and TMCEN methods, cluster sizes of 2, 3 and 4 are considered. In the training data all variables are centered and scaled to have mean zero and a standard deviation of one. We filter the gene expression data for each response by using the top 25 genes in terms of absolute value of correlation with a given response. |
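
The two-step procedure quoted in the Pseudocode row can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the `mcen` package's implementation: the function names (`mcen_sketch`, `update_B`, `kmeans`) are hypothetical, the fusion penalty is assumed to be `gamma / 2` times the sum of squared differences in fitted values over pairs of responses in the same group, the initial estimates are assumed to come from separate lasso fits, and step (b) is solved by coordinate descent.

```python
import random

def fitted(X, beta):
    """Fitted values X @ beta for a list-of-rows X."""
    return [sum(x * b for x, b in zip(row, beta)) for row in X]

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centers = [list(p) for p in random.Random(seed).sample(points, k)]
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda q: sum((a - c) ** 2 for a, c in zip(p, centers[q])))
               for p in points]
        if new == labels:
            break
        labels = new
        for q in range(k):
            members = [p for p, lab in zip(points, labels) if lab == q]
            if members:  # keep the old center if a cluster empties
                centers[q] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def soft(z, t):
    """Soft-thresholding operator used by lasso coordinate descent."""
    return z - t if z > t else z + t if z < -t else 0.0

def update_B(X, Y, B, labels, delta, gamma, sweeps=100):
    """Step (b): coordinate descent with groups fixed. B[c] holds beta_c."""
    n, p, r = len(X), len(X[0]), len(B)
    cols = list(zip(*X))
    F = [fitted(X, B[c]) for c in range(r)]  # F[c][i] = x_i^T beta_c
    for _ in range(sweeps):
        for c in range(r):
            mates = [m for m in range(r) if m != c and labels[m] == labels[c]]
            for j in range(p):
                xj = cols[j]
                xx = sum(v * v for v in xj)
                if xx == 0:
                    continue
                z = 0.0
                for i in range(n):
                    g = F[c][i] - xj[i] * B[c][j]  # fit without coordinate j
                    z += xj[i] * ((Y[i][c] - g) / n + gamma * sum(F[m][i] - g for m in mates))
                new = soft(z, delta) / (xx * (1.0 / n + gamma * len(mates)))
                step = new - B[c][j]
                B[c][j] = new
                for i in range(n):
                    F[c][i] += xj[i] * step
    return B

def partition(labels):
    """View of a clustering that ignores how the labels are numbered."""
    return {frozenset(i for i, lab in enumerate(labels) if lab == q) for q in set(labels)}

def mcen_sketch(X, Y, Q, delta, gamma, max_outer=20):
    p, r = len(X[0]), len(Y[0])
    # Step 1 (assumed): initial estimates from separate lasso fits, no fusion.
    B = update_B(X, Y, [[0.0] * p for _ in range(r)], list(range(r)), delta, 0.0)
    groups = None
    for _ in range(max_outer):
        labels = kmeans([fitted(X, B[c]) for c in range(r)], Q)  # step (a)
        if partition(labels) == groups:  # group estimates did not change
            break
        groups = partition(labels)
        B = update_B(X, Y, B, labels, delta, gamma)  # step (b)
    return B, labels

# Toy data: responses 0 and 1 depend on predictor 0; responses 2 and 3 on predictor 1.
X = [[(-1) ** i, (-1) ** (i // 2), (-1) ** (i + i // 2)] for i in range(8)]
Y = [[2.0 * x[0], 2.2 * x[0], -3.0 * x[1], -2.8 * x[1]] for x in X]
B, labels = mcen_sketch(X, Y, Q=2, delta=0.05, gamma=0.1)
```

On this toy design the responses that share a predictor should land in the same group, and the fusion term should pull their coefficients closer together than separate lasso fits would. Groups are compared as partitions rather than raw labels because K-means may renumber clusters between iterations.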
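
The two splitting schemes quoted in the Dataset Splits row (a random 50/15 partition, and a first-1000 / next-1000 split with 3-fold cross validation on the training half) can be illustrated with the standard library alone. The helper names and the seed are hypothetical, and the interleaved fold assignment is an assumption, since the quoted text does not say how folds were formed.

```python
import random

def random_partition(n_samples, n_train, seed=0):
    """Shuffle sample indices and split them into train and test lists."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return idx[:n_train], idx[n_train:]

def k_fold_indices(indices, k):
    """Assign indices to k folds by interleaving (an assumed scheme)."""
    return [indices[f::k] for f in range(k)]

# Random split: 50 training and 15 testing samples.
train_idx, test_idx = random_partition(65, 50)

# Ordered split: the first 1000 transactions train, the next 1000 evaluate.
train_tx, eval_tx = list(range(1000)), list(range(1000, 2000))
folds = k_fold_indices(train_tx, 3)  # 3-fold cross validation on the training half
```

Each fold serves once as the held-out set while the other two are used for fitting; the tuning parameters with the best held-out error would then be refit on all 1000 training transactions.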
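
The preprocessing described in the Experiment Setup row (standardizing training variables, then keeping the predictors most correlated in absolute value with a response) might look like the sketch below. The function names and toy numbers are hypothetical, and the use of the sample standard deviation (divisor n - 1) is an assumption; the paper keeps the top 25 genes, while the toy example keeps 2.

```python
from statistics import mean, stdev

def standardize(column):
    """Center to mean zero and scale to standard deviation one."""
    m, s = mean(column), stdev(column)  # sample sd (n - 1), an assumption
    return [(v - m) / s for v in column]

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den if den else 0.0

def top_k_by_abs_correlation(columns, response, k):
    """Indices of the k columns with the largest absolute correlation with the response."""
    ranked = sorted(range(len(columns)),
                    key=lambda j: abs(pearson(columns[j], response)),
                    reverse=True)
    return ranked[:k]

# Toy training data: four candidate predictors and one response.
columns = [
    [1, 2, 3, 4, 5],
    [2, 1, 4, 3, 5],
    [5, 4, 3, 2, 1],
    [1, -1, 1, -1, 1],
]
response = [1.1, 1.9, 3.2, 3.9, 5.1]

z = standardize(columns[0])
keep = top_k_by_abs_correlation(columns, response, k=2)
```

Ranking by absolute correlation keeps strongly negative predictors (the third toy column) as well as strongly positive ones. The filter is applied once per response, so different responses can keep different predictors.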