Multi-task Sparse Structure Learning with Gaussian Copula Models
Authors: André R. Gonçalves, Fernando J. Von Zuben, Arindam Banerjee
JMLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the effectiveness of the proposed model on a variety of synthetic and benchmark data sets for regression and classification. We also consider the problem of combining Earth System Model (ESM) outputs for better projections of future climate, with focus on projections of temperature by combining ESMs in South and North America, and show that the proposed model outperforms several existing methods for the problem. |
| Researcher Affiliation | Academia | André R. Gonçalves EMAIL, Fernando J. Von Zuben EMAIL, School of Electrical and Computer Engineering, University of Campinas, São Paulo, Brazil; Arindam Banerjee EMAIL, Computer Science Department, University of Minnesota Twin Cities, Minneapolis, USA |
| Pseudocode | Yes | Algorithm 1: Multitask Sparse Structure Learning (MSSL) algorithm |
| Open Source Code | Yes | Data sets and code are available at: bitbucket.org/andreric/mssl-code. |
| Open Datasets | Yes | For the experiments we use 10 ESMs from the CMIP5 data set (Taylor et al., 2012). Details about the ESMs data sets are listed in Table 1. The global observation data for surface temperature is obtained from the Climate Research Unit (CRU) at the University of East Anglia. ... This data set can be found at http://www.ecmlpkdd2006.org/challenge.html. ... This data set can be found at http://yann.lecun.com/exdb/mnist/. ... This data set can be found at http://ai.stanford.edu/~btaskar/ocr/. ... This data set can be found at http://vision.ucsd.edu/content/yale-face-database. |
| Dataset Splits | Yes | We train the p-MSSL model with 50 data instances and test it on 100 data instances. ... For each scenario, the same number of training data (columns of Table 2) are used for all tasks, and the remaining data is used for test. Starting from one year of temperature measures (12 samples), we increase till ten years of data for training. The remained data was used as test set. ... It was obtained over 10 independent runs using a holdout cross-validation (2/3 for training and 1/3 for test). |
| Hardware Specification | No | Computing facilities were provided by University of Minnesota Supercomputing Institute (MSI). |
| Software Dependencies | No | The paper mentions algorithms like FISTA (Beck and Teboulle, 2009) and ADMM (Boyd et al., 2011), and uses external code bases like MALSAR, but does not provide specific version numbers for the software dependencies used in their own implementation. |
| Experiment Setup | Yes | The parameter λ0 was set to one in all experiments. ... To select the penalty parameters λ1 and λ2 we use a stability selection procedure described in Meinshausen and Bühlmann (2010). ... the regularization parameters for all algorithms were selected using cross-validation from the set {0.01, 0.1, 1, 10, 100}. |
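
The Dataset Splits and Experiment Setup rows describe a protocol of 10 independent holdout runs (2/3 for training, 1/3 for test) with regularization parameters chosen by cross-validation from the set {0.01, 0.1, 1, 10, 100}. The following is a minimal sketch of that protocol, not the authors' code: it assumes a plain ridge regression as a stand-in for MSSL, a 5-fold inner cross-validation, and RMSE as the error measure, none of which is stated in the table. The function names (`ridge_fit`, `select_lambda`, `holdout_runs`) are hypothetical.

```python
import numpy as np

# Candidate regularization values quoted in the Experiment Setup row.
GRID = [0.01, 0.1, 1, 10, 100]


def ridge_fit(X, y, lam):
    """Closed-form ridge regression (assumed stand-in model, not MSSL)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def rmse(X, y, w):
    """Root mean squared error of the linear predictor w (assumed metric)."""
    return float(np.sqrt(np.mean((X @ w - y) ** 2)))


def select_lambda(X, y, grid=GRID, n_folds=5, rng=None):
    """Pick the grid value with the lowest k-fold CV error on training data.

    The number of folds is an assumption; the source only says
    'cross-validation'.
    """
    rng = np.random.default_rng() if rng is None else rng
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    scores = []
    for lam in grid:
        errs = []
        for k in range(n_folds):
            val = folds[k]
            trn = np.concatenate([folds[j] for j in range(n_folds) if j != k])
            w = ridge_fit(X[trn], y[trn], lam)
            errs.append(rmse(X[val], y[val], w))
        scores.append(np.mean(errs))
    return grid[int(np.argmin(scores))]


def holdout_runs(X, y, n_runs=10, train_frac=2 / 3, seed=0):
    """Repeat a random 2/3 vs 1/3 holdout split n_runs times.

    Returns a list of (selected lambda, test RMSE) pairs, one per run.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    n_train = int(round(train_frac * n))
    results = []
    for _ in range(n_runs):
        perm = rng.permutation(n)
        trn, tst = perm[:n_train], perm[n_train:]
        lam = select_lambda(X[trn], y[trn], rng=rng)  # tuned on training data only
        w = ridge_fit(X[trn], y[trn], lam)
        results.append((lam, rmse(X[tst], y[tst], w)))
    return results


if __name__ == "__main__":
    # Synthetic data for illustration only.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(150, 5))
    y = X @ np.arange(1, 6) + 0.1 * rng.normal(size=150)
    for lam, err in holdout_runs(X, y):
        print(f"lambda={lam}, test RMSE={err:.3f}")
```

The regularization value is selected inside each run using only the training portion, so the held-out third never influences the choice; whether the paper tunes per run or once overall is not stated in the table.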