Multi-task Sparse Structure Learning with Gaussian Copula Models

Authors: André R. Gonçalves, Fernando J. Von Zuben, Arindam Banerjee

JMLR 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We illustrate the effectiveness of the proposed model on a variety of synthetic and benchmark data sets for regression and classification. We also consider the problem of combining Earth System Model (ESM) outputs for better projections of future climate, with focus on projections of temperature by combining ESMs in South and North America, and show that the proposed model outperforms several existing methods for the problem.
Researcher Affiliation | Academia | André R. Gonçalves (EMAIL), Fernando J. Von Zuben (EMAIL), School of Electrical and Computer Engineering, University of Campinas, São Paulo, Brazil; Arindam Banerjee (EMAIL), Computer Science Department, University of Minnesota Twin Cities, Minneapolis, USA
Pseudocode | Yes | Algorithm 1: Multitask Sparse Structure Learning (MSSL) algorithm
Open Source Code | Yes | Data sets and code are available at: bitbucket.org/andreric/mssl-code.
Open Datasets | Yes | For the experiments we use 10 ESMs from the CMIP5 data set (Taylor et al., 2012). Details about the ESMs data sets are listed in Table 1. The global observation data for surface temperature is obtained from the Climate Research Unit (CRU) at the University of East Anglia. ... This data set can be found at http://www.ecmlpkdd2006.org/challenge.html. ... This data set can be found at http://yann.lecun.com/exdb/mnist/. ... This data set can be found at http://ai.stanford.edu/~btaskar/ocr/. ... This data set can be found at http://vision.ucsd.edu/content/yale-face-database.
Dataset Splits | Yes | We train the p-MSSL model with 50 data instances and test it on 100 data instances. ... For each scenario, the same number of training data (columns of Table 2) are used for all tasks, and the remaining data is used for test. Starting from one year of temperature measures (12 samples), we increase to ten years of data for training. The remaining data was used as the test set. ... It was obtained over 10 independent runs using a holdout cross-validation (2/3 for training and 1/3 for test).
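The repeated-holdout protocol quoted above (10 independent runs, 2/3 train / 1/3 test) can be sketched in a few lines of numpy; the function name and seed below are illustrative, not from the paper:

```python
import numpy as np

def holdout_splits(n_samples, n_runs=10, train_frac=2 / 3, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated holdout validation.

    Mirrors the paper's protocol of 10 independent runs with a
    2/3 train / 1/3 test split; defaults are assumptions for illustration.
    """
    rng = np.random.default_rng(seed)
    n_train = int(round(train_frac * n_samples))
    for _ in range(n_runs):
        perm = rng.permutation(n_samples)  # fresh shuffle each run
        yield perm[:n_train], perm[n_train:]

# Example: 150 samples -> 100 train / 50 test indices per run
splits = list(holdout_splits(150))
```

Each run draws an independent random permutation, so the 10 test sets overlap (unlike K-fold), which matches a holdout-with-repetitions design.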
Hardware Specification | No | Computing facilities were provided by University of Minnesota Supercomputing Institute (MSI).
Software Dependencies | No | The paper mentions algorithms like FISTA (Beck and Teboulle, 2009) and ADMM (Boyd et al., 2011), and uses external code bases like MALSAR, but does not provide specific version numbers for the software dependencies used in their own implementation.
Experiment Setup | Yes | The parameter λ0 was set to one in all experiments. ... To select the penalty parameters λ1 and λ2 we use a stability selection procedure described in Meinshausen and Bühlmann (2010). ... the regularization parameters for all algorithms were selected using cross-validation from the set {0.01, 0.1, 1, 10, 100}.
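The cross-validation grid search over {0.01, 0.1, 1, 10, 100} quoted above can be sketched as follows. This is a generic K-fold grid search, not the authors' solver: `fit_predict` is a placeholder for the actual MSSL fitting routine, and the toy ridge solver used in the demo is an assumption for illustration only:

```python
import itertools
import numpy as np

GRID = [0.01, 0.1, 1, 10, 100]  # regularization values quoted in the paper

def select_lambdas(X, y, fit_predict, n_folds=5, seed=0):
    """Pick (lam1, lam2) by K-fold CV over the paper's grid.

    `fit_predict(X_tr, y_tr, X_te, lam1, lam2)` stands in for the real
    MSSL solver; any callable returning test-set predictions works.
    """
    n = len(y)
    folds = np.array_split(np.random.default_rng(seed).permutation(n), n_folds)
    best, best_err = None, np.inf
    for lam1, lam2 in itertools.product(GRID, GRID):
        err = 0.0
        for k in range(n_folds):
            te = folds[k]
            tr = np.concatenate([folds[j] for j in range(n_folds) if j != k])
            pred = fit_predict(X[tr], y[tr], X[te], lam1, lam2)
            err += np.mean((pred - y[te]) ** 2)  # accumulate fold MSE
        if err < best_err:
            best, best_err = (lam1, lam2), err
    return best

# Demo with a toy ridge solver standing in for MSSL (hypothetical)
def _ridge(Xtr, ytr, Xte, lam1, lam2):
    d = Xtr.shape[1]
    w = np.linalg.solve(Xtr.T @ Xtr + lam1 * np.eye(d), Xtr.T @ ytr)
    return Xte @ w

rng = np.random.default_rng(1)
X = rng.standard_normal((60, 5))
y = X @ rng.standard_normal(5)
best = select_lambdas(X, y, _ridge)
```

Note this sketches only the cross-validation path; the stability-selection procedure of Meinshausen and Bühlmann (2010) that the paper uses for λ1 and λ2 is a different mechanism (repeated subsampling and selection-frequency thresholds) and is not shown here.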