Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Marginalizing Stacked Linear Denoising Autoencoders
Authors: Minmin Chen, Kilian Q. Weinberger, Zhixiang (Eddie) Xu, Fei Sha
JMLR 2015 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 7 we present an extensive set of results evaluating mSLDA on several text classification and object recognition tasks. |
| Researcher Affiliation | Collaboration | Minmin Chen (EMAIL), Criteo, Palo Alto, CA 94301, USA; Kilian Q. Weinberger (EMAIL) and Zhixiang (Eddie) Xu (EMAIL), Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, MO 63130, USA; Fei Sha (EMAIL), Computer Science Department, University of Southern California, Los Angeles, CA 90089, USA |
| Pseudocode | Yes | Algorithm 1: mLDA (for blankout corruption), in MATLAB™. Algorithm 2: mSLDA, in MATLAB™. |
| Open Source Code | No | No explicit statement or link to the authors' source code is provided in the paper. Although the algorithms are presented in MATLAB™-like pseudocode, there is no release statement or repository link. |
| Open Datasets | Yes | First, we consider a domain adaptation task for sentiment analysis. We evaluate mSLDA on the Amazon reviews benchmark data sets (Blitzer et al., 2006) together with several other algorithms for representation learning and domain adaptation. In this section, we evaluate mSLDA on a dataset collected by Saenko et al. (2010) for studying domain shifts in visual category recognition tasks, together with several other algorithms designed for this dataset. We use the Reuters RCV1/RCV2 multilingual, multiview text categorization test collection (Amini et al., 2010) for evaluation. |
| Dataset Splits | Yes | The controlled smaller dataset was created by Blitzer et al. (2007) and contains reviews of four types of products: books, DVDs, electronics, and kitchen appliances. Each domain consists of 2,000 labeled inputs and approximately 4,000 unlabeled ones (varying slightly between domains), and the two classes are exactly balanced. For the source domain, 8 labels per category are available for webcam/dSLR and 20 for amazon, while only three labels from the target domain are used in training. Five runs of experiments, each with a set of randomly selected labels, are carried out, and we report the averaged accuracies. As shown in Figure 8, we gradually increase the size of the labeled subset. For each setting, we average over 10 runs of each algorithm and report the mean accuracy as well as the variance. |
| Hardware Specification | Yes | All experiments were conducted on an off-the-shelf desktop with dual 6-core Intel i7 CPUs clocked at 2.66 GHz. |
| Software Dependencies | No | The paper mentions implementing algorithms in MATLAB™ and using LIBLINEAR (Fan et al., 2008) for SVMs, but specific version numbers for these software dependencies are not provided. |
| Experiment Setup | Yes | There are two sets of meta-parameters in mSLDA: the corruption parameters (e.g. p in the case of blankout corruption) and the number of layers L. In our experiments, both are set with 5-fold cross validation on the labeled data from the source domain. All hyper-parameters are set by 5-fold cross validation on the source training set. Blankout noise is used in this experiment, with the corruption level cross-validated over [0.1, 0.9] in steps of 0.1. |
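Since the authors present Algorithm 1 (mLDA for blankout corruption) only as MATLAB™ pseudocode and released no code, the following is a minimal Python sketch of the well-known closed-form mLDA layer W = PQ⁻¹, where Q and P are the expected scatter matrices of the corrupted and clean inputs under blankout noise with probability p. The function name `mlda_layer` and the small ridge term added for numerical stability are our own choices, not from the paper.

```python
import numpy as np

def mlda_layer(X, p):
    """One marginalized linear denoising autoencoder (mLDA) layer.

    X : (d, n) data matrix, one column per example.
    p : blankout corruption probability (each feature is zeroed with prob. p).

    Returns the (d, d+1) reconstruction mapping W and the layer output W @ Xb.
    """
    d, n = X.shape
    Xb = np.vstack([X, np.ones((1, n))])               # append constant bias feature
    q = np.concatenate([np.full(d, 1.0 - p), [1.0]])   # feature survival probabilities
    S = Xb @ Xb.T                                      # scatter matrix of the data
    Q = S * np.outer(q, q)                             # E[x_tilde x_tilde^T], off-diagonal terms
    np.fill_diagonal(Q, q * np.diag(S))                # diagonal scales by q_i, not q_i^2
    P = S[:d, :] * q[np.newaxis, :]                    # E[x x_tilde^T]
    # Solve W Q = P for W; the 1e-5 ridge keeps Q well-conditioned.
    W = np.linalg.solve(Q + 1e-5 * np.eye(d + 1), P.T).T
    return W, W @ Xb
```

Stacking layers (the "S" in mSLDA) amounts to feeding each layer's output back in as the next layer's input; with p = 0 the mapping reduces to (regularized) reconstruction of the input itself, which is a quick sanity check.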