Feature-Level Domain Adaptation
Authors: Wouter M. Kouw, Laurens J.P. van der Maaten, Jesse H. Krijthe, Marco Loog
JMLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical evaluation of flda focuses on problems comprising binary and count data in which the transfer can be naturally modeled via a dropout distribution, which allows the classifier to adapt to differences in the marginal probability of features in the source and the target domain. Our experiments on several real-world problems show that flda performs on par with state-of-the-art domain-adaptation techniques. |
| Researcher Affiliation | Academia | Wouter M. Kouw (EMAIL), Laurens J.P. van der Maaten (EMAIL): Department of Intelligent Systems, Delft University of Technology, Mekelweg 4, 2628 CD Delft, the Netherlands. Jesse H. Krijthe (EMAIL): Department of Intelligent Systems, Delft University of Technology, Mekelweg 4, 2628 CD Delft, the Netherlands; Department of Molecular Epidemiology, Leiden University Medical Center, Einthovenweg 20, 2333 ZC Leiden, the Netherlands. Marco Loog (EMAIL): Department of Intelligent Systems, Delft University of Technology, Mekelweg 4, 2628 CD Delft, the Netherlands; The Image Group, University of Copenhagen, Universitetsparken 5, DK-2100 Copenhagen, Denmark |
| Pseudocode | Yes | Algorithm 1 (binary flda with dropout transfer model and quadratic loss function): procedure flda-q(S, T): for d = 1, …, m do: η_d = \|S\|⁻¹ Σ_{x_i ∈ S} 1[x_{id} ≠ 0]; ζ_d = \|T\|⁻¹ Σ_{z_j ∈ T} 1[z_{jd} ≠ 0]; θ_d = max{0, 1 − ζ_d/η_d}; end for; w = (XXᵀ + diag(θ/(1 − θ)) ∘ XXᵀ)⁻¹ X y, where ∘ denotes the element-wise product; return sign(wᵀZ); end procedure |
| Open Source Code | No | The paper discusses the source code of a third-party tool, libsvm, that the authors used: 'We made use of the libsvm package by Chang and Lin (2011) with a radial basis function kernel and we performed cross-validation to estimate the kernel bandwidth and the ℓ2-regularization parameter. All multi-class classification is done through an one-vs-one scheme. This method can be readily compared to subspace alignment (sa) and transfer component analysis (tca) to evaluate the effects of the respective adaptation approaches.' and 'Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.' There is no explicit statement about releasing the source code for their own methodology (flda). |
| Open Datasets | Yes | We have collected six data sets from the UCI machine learning repository (Lichman, 2013) with missing data: Hepatitis (hepat.), Ozone (ozone; Zhang and Fan, 2008), Heart Disease (heart; Detrano et al., 1989), Mammographic masses (mam.; Elter et al., 2007), Automobile (auto), and Arrhythmia (arrhy.; Guvenir et al., 1997). We created a domain adaptation setting by considering two handwritten digit sets, namely MNIST (Le Cun et al., 1998) and USPS (Hull, 1994). The Office-Caltech data set (Hoffman et al., 2013). The Internet Movie Database (IMDb) (Pang and Lee, 2004). two data sets from the UCI machine learning repository: one containing 4205 emails from the Enron spam database (Klimt and Yang, 2004) and one containing 5338 text messages from the SMS-spam data set (Almeida et al., 2011). We performed a similar experiment on the Amazon sentiment analysis data set of product reviews (Blitzer et al., 2007). |
| Dataset Splits | Yes | The source training and validation data was generated from the same bivariate Poisson distributions as in Figure 2. The target data was constructed by generating additional source data and dropping out the first feature with a probability of 0.5. Each of the four data sets contained 10,000 samples. The experiment was repeated 50 times for every sample size to calculate the standard error of the mean. In the experiments, we construct the training set (source domain) by selecting all samples with no missing data, with the remainder as the test set (target domain). In all experiments, we estimate the hyperparameters, such as ℓ2-regularization parameters, via cross-validation on held-out source data. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. It only describes the methodology and datasets. |
| Software Dependencies | No | The paper mentions 'libsvm package by Chang and Lin (2011)' and provides a link, but it does not specify a version number for libsvm that was used in their experiments. No other specific software dependencies with version numbers are provided. |
| Experiment Setup | Yes | In the first experiment, we generate binary features by drawing 100,000 samples from two bivariate Bernoulli distributions. The marginal distributions are (0.7, 0.7) for class one and (0.3, 0.3) for class two. The source data is transformed to the target data using a dropout transfer model with parameters θ = (0.5, 0). In the second experiment, we generate count features by sampling from bivariate Poisson distributions. Herein, we used rate parameters λ = (2, 2) for the first class and λ = (6, 6) for the second class. |
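The flda-q procedure quoted in the Pseudocode row can be sketched in NumPy. This is a minimal reading of Algorithm 1, not the authors' implementation: it assumes η_d and ζ_d are per-feature empirical nonzero rates, reads the element-wise product diag(θ/(1−θ)) ∘ XXᵀ as a diagonal dropout-variance penalty on the Gram matrix, and adds a small `eps` guard against division by zero that is not in the paper.

```python
import numpy as np

def flda_q(X, y, Z, eps=1e-12):
    """Sketch of binary flda-q with a dropout transfer model.

    X : (n, m) source features, y : (n,) labels in {-1, +1},
    Z : (k, m) target features. Returns predicted target labels.
    """
    # Per-feature empirical nonzero rates in source and target.
    eta = (X != 0).mean(axis=0)
    zeta = (Z != 0).mean(axis=0)
    # Estimated per-feature dropout rates, clipped at zero.
    theta = np.maximum(0.0, 1.0 - zeta / (eta + eps))
    # Closed-form quadratic-loss solution with a dropout-variance penalty:
    # diag(theta/(1-theta)) taken element-wise against the Gram matrix
    # keeps only its diagonal, scaled per feature.
    G = X.T @ X
    penalty = np.diag(theta / (1.0 - theta + eps)) * G
    w = np.linalg.solve(G + penalty, X.T @ y)
    return np.sign(Z @ w)
```

Features with a larger estimated dropout rate θ_d receive a stronger quadratic penalty, so the classifier leans less on features that tend to vanish in the target domain.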
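The UCI split construction described in the Dataset Splits row (complete cases become the source domain, samples with missing values become the target domain) can be sketched as follows, assuming missing entries are encoded as NaN:

```python
import numpy as np

def split_by_missingness(X, y):
    """Source domain = samples with no missing features; target = the rest."""
    complete = ~np.isnan(X).any(axis=1)
    return (X[complete], y[complete]), (X[~complete], y[~complete])
```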
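The first synthetic setup in the Experiment Setup row can be sketched as below. One assumption is made beyond the quoted text: the two binary features are drawn independently within each class, since the paper excerpt specifies only the marginals.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000  # samples per class

# Bivariate Bernoulli source data: marginals (0.7, 0.7) for class one
# and (0.3, 0.3) for class two (features assumed independent).
X1 = (rng.random((n, 2)) < 0.7).astype(float)
X2 = (rng.random((n, 2)) < 0.3).astype(float)
X = np.vstack([X1, X2])

# Dropout transfer model with theta = (0.5, 0): each first-feature value
# is zeroed with probability 0.5; the second feature is untouched.
theta = np.array([0.5, 0.0])
Z = X * (rng.random(X.shape) >= theta)
```

After the transfer, the nonzero rate of the first feature is halved while the second feature's marginal is unchanged, which is exactly the feature-level shift flda is built to absorb.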