Adaptation Based on Generalized Discrepancy
Authors: Corinna Cortes, Mehryar Mohri, Andrés Muñoz Medina
JMLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 6, we report the results of experiments demonstrating that our algorithm improves upon the DM algorithm in several tasks. ... We now present the results of evaluating our algorithm against several other adaptation algorithms. |
| Researcher Affiliation | Collaboration | Corinna Cortes (Google Research, 111 8th Ave, New York, NY 10011); Mehryar Mohri and Andrés Muñoz Medina (Courant Institute of Mathematical Sciences, 251 Mercer Street, New York, NY 10012) |
| Pseudocode | No | The paper describes the algorithm's formulation and optimization steps in Section 3 and Section 5, but it does not present a clearly labeled pseudocode block or algorithm. The steps are integrated into the descriptive text. |
| Open Source Code | Yes | The source code for our algorithm as well as all other baselines described in this section can be found at http://cims.nyu.edu/~munoz. |
| Open Datasets | Yes | The first task we considered is given by the 4 kin-8xy Delve data sets (Rasmussen et al., 1996). ... For our next experiment we considered the cross-domain sentiment analysis data set of Blitzer et al. (2007). ... Finally, we considered a novel domain adaptation task (Tommasi et al., 2014) of paramount importance in the computer vision community. The domains correspond to 4 well known collections of images: bing, caltech256, sun and imagenet. |
| Dataset Splits | Yes | A sample of 200 points from each domain was used for training and 10 labeled points from the target distribution were used to select H . The experiment was carried out 10 times and the results of testing on a sample of 400 points from the target domain are reported in Figure 3(a). ... For each pair of adaptation tasks we sampled 700 points from the source distribution and 700 unlabeled points from the target. Only 50 labeled points from the target distribution were used to tune the parameter r of our algorithm. The final evaluation is done on a test set of 1000 points. ... We sampled 800 labeled points from the source distribution and 800 unlabeled points from the target distribution as well as 50 labeled target points to be used for validation of r. The results of testing on 1000 points from the target domain are presented in Table 3. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using ridge regression and a QP implementation, but it does not name specific software libraries or version numbers (e.g., particular Python libraries or solver versions). |
| Experiment Setup | Yes | for all adaptation algorithms we selected the parameter λ via 10-fold cross validation over the training data by using a grid search over the set of values λ ∈ {2^−10, . . . , 2^10}. The hyper-parameters of this algorithm [KMM] were set to the recommended values of B = 1000 and ϵ = √m/(√m − 1). ... The bandwidth for the kernel was selected from the set {σ·d : σ = 2^−5, . . . , 2^5} via validation on the test set, where d is the mean distance between points sampled from the source domain. ... where the parameter λr was chosen from the same set as λ via validation on a small amount of data from the target distribution. |
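The selection protocol quoted in the Experiment Setup row (10-fold cross-validation with a grid search over λ ∈ {2^−10, . . . , 2^10} for a ridge-regression learner) can be sketched as follows. This is a minimal illustrative reconstruction, not the authors' released code: the closed-form ridge solver, the fold-splitting scheme, and the synthetic data in the usage example are all assumptions.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: w = (X'X + lam*I)^{-1} X'y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def cv_select_lambda(X, y, lambdas, n_folds=10, seed=0):
    """Pick the lambda minimizing mean validation MSE over n_folds CV splits."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    folds = np.array_split(idx, n_folds)
    best_lam, best_err = None, np.inf
    for lam in lambdas:
        errs = []
        for k in range(n_folds):
            val = folds[k]
            tr = np.concatenate([folds[j] for j in range(n_folds) if j != k])
            w = ridge_fit(X[tr], y[tr], lam)
            errs.append(np.mean((X[val] @ w - y[val]) ** 2))
        mean_err = float(np.mean(errs))
        if mean_err < best_err:
            best_lam, best_err = lam, mean_err
    return best_lam, best_err

# Grid from the paper: lambda in {2^-10, ..., 2^10}
lambdas = [2.0 ** k for k in range(-10, 11)]
```

A usage example on synthetic regression data (illustrative only):

```python
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=200)
lam, err = cv_select_lambda(X, y, lambdas)
```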