Multi-task Sparse Structure Learning with Gaussian Copula Models

Authors: André R. Gonçalves, Fernando J. Von Zuben, Arindam Banerjee

JMLR 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We illustrate the effectiveness of the proposed model on a variety of synthetic and benchmark data sets for regression and classification. We also consider the problem of combining Earth System Model (ESM) outputs for better projections of future climate, with focus on projections of temperature by combining ESMs in South and North America, and show that the proposed model outperforms several existing methods for the problem.
Researcher Affiliation | Academia | André R. Gonçalves (EMAIL), Fernando J. Von Zuben (EMAIL), School of Electrical and Computer Engineering, University of Campinas, São Paulo, Brazil; Arindam Banerjee (EMAIL), Computer Science Department, University of Minnesota Twin Cities, Minneapolis, USA
Pseudocode | Yes | Algorithm 1: Multitask Sparse Structure Learning (MSSL) algorithm
Open Source Code | Yes | Data sets and code are available at: bitbucket.org/andreric/mssl-code.
Open Datasets | Yes | For the experiments we use 10 ESMs from the CMIP5 data set (Taylor et al., 2012). Details about the ESMs data sets are listed in Table 1. The global observation data for surface temperature is obtained from the Climate Research Unit (CRU) at the University of East Anglia. ... This data set can be found at http://www.ecmlpkdd2006.org/challenge.html. ... This data set can be found at http://yann.lecun.com/exdb/mnist/. ... This data set can be found at http://ai.stanford.edu/~btaskar/ocr/. ... This data set can be found at http://vision.ucsd.edu/content/yale-face-database.
Dataset Splits | Yes | We train the p-MSSL model with 50 data instances and test it on 100 data instances. ... For each scenario, the same number of training data (columns of Table 2) are used for all tasks, and the remaining data is used for test. Starting from one year of temperature measures (12 samples), we increase to ten years of data for training. The remaining data was used as the test set. ... It was obtained over 10 independent runs using a holdout cross-validation (2/3 for training and 1/3 for test).
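The repeated-holdout protocol quoted above (10 independent runs, 2/3 train / 1/3 test) can be sketched in a few lines of numpy; the function name and seed below are illustrative, not from the paper:

```python
import numpy as np

def holdout_splits(n_samples, n_runs=10, train_frac=2 / 3, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated holdout validation.

    Mirrors the paper's protocol of 10 independent runs with a
    2/3 train / 1/3 test split; defaults are assumptions for illustration.
    """
    rng = np.random.default_rng(seed)
    n_train = int(round(train_frac * n_samples))
    for _ in range(n_runs):
        perm = rng.permutation(n_samples)  # fresh shuffle each run
        yield perm[:n_train], perm[n_train:]

# Example: 150 samples -> 100 train / 50 test indices per run
splits = list(holdout_splits(150))
```

Each run draws an independent random permutation, so the 10 test sets overlap (unlike K-fold), which matches a holdout-with-repetitions design.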
Hardware Specification | No | Computing facilities were provided by University of Minnesota Supercomputing Institute (MSI).
Software Dependencies | No | The paper mentions algorithms like FISTA (Beck and Teboulle, 2009) and ADMM (Boyd et al., 2011), and uses external code bases like MALSAR, but does not provide specific version numbers for the software dependencies used in their own implementation.
Experiment Setup | Yes | The parameter λ0 was set to one in all experiments. ... To select the penalty parameters λ1 and λ2 we use a stability selection procedure described in Meinshausen and Bühlmann (2010). ... the regularization parameters for all algorithms were selected using cross-validation from the set {0.01, 0.1, 1, 10, 100}.
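The cross-validation grid search over {0.01, 0.1, 1, 10, 100} quoted above can be sketched as follows. This is a generic K-fold grid search, not the authors' solver: `fit_predict` is a placeholder for the actual MSSL fitting routine, and the toy ridge solver used in the demo is an assumption for illustration only:

```python
import itertools
import numpy as np

GRID = [0.01, 0.1, 1, 10, 100]  # regularization values quoted in the paper

def select_lambdas(X, y, fit_predict, n_folds=5, seed=0):
    """Pick (lam1, lam2) by K-fold CV over the paper's grid.

    `fit_predict(X_tr, y_tr, X_te, lam1, lam2)` stands in for the real
    MSSL solver; any callable returning test-set predictions works.
    """
    n = len(y)
    folds = np.array_split(np.random.default_rng(seed).permutation(n), n_folds)
    best, best_err = None, np.inf
    for lam1, lam2 in itertools.product(GRID, GRID):
        err = 0.0
        for k in range(n_folds):
            te = folds[k]
            tr = np.concatenate([folds[j] for j in range(n_folds) if j != k])
            pred = fit_predict(X[tr], y[tr], X[te], lam1, lam2)
            err += np.mean((pred - y[te]) ** 2)  # accumulate fold MSE
        if err < best_err:
            best, best_err = (lam1, lam2), err
    return best

# Demo with a toy ridge solver standing in for MSSL (hypothetical)
def _ridge(Xtr, ytr, Xte, lam1, lam2):
    d = Xtr.shape[1]
    w = np.linalg.solve(Xtr.T @ Xtr + lam1 * np.eye(d), Xtr.T @ ytr)
    return Xte @ w

rng = np.random.default_rng(1)
X = rng.standard_normal((60, 5))
y = X @ rng.standard_normal(5)
best = select_lambdas(X, y, _ridge)
```

Note this sketches only the cross-validation path; the stability-selection procedure of Meinshausen and Bühlmann (2010) that the paper uses for λ1 and λ2 is a different mechanism (repeated subsampling and selection-frequency thresholds) and is not shown here.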