reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Multi-class Heterogeneous Domain Adaptation

Authors: Joey Tianyi Zhou, Ivor W. Tsang, Sinno Jialin Pan, Mingkui Tan

JMLR 2019 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Lastly, we conduct extensive experiments on both synthetic and real-world datasets to demonstrate the superiority of our proposed method over existing state-of-the-art HDA methods in terms of prediction accuracy and training efficiency. In Section 6, we conduct extensive experiments on both toy and real-world datasets to demonstrate the effectiveness and efficiency of SHFR.
Researcher Affiliation	Academia	Joey Tianyi Zhou EMAIL Institute of High Performance Computing (IHPC) Agency for Science, Technology and Research (A*STAR), Singapore 138632. Ivor W. Tsang EMAIL Centre for Artificial Intelligence University of Technology Sydney Australia. Sinno Jialin Pan EMAIL School of Computer Science and Engineering Nanyang Technological University, Singapore 639798. Mingkui Tan EMAIL School of Software Engineering South China University of Technology, China.
Pseudocode	Yes	Algorithm 1 Matching pursuit LASSO for solving the nonnegative LASSO problem. Initialize the residual ξ^0 = b, I_0 = ∅, and let t = 1. 1: Compute q = D^T ξ^{t-1}, choose the B largest q_j, and record their indices by J_t. 2: Let I_t = I_{t-1} ∪ J_t. 3: Let g_{I_t^c} = 0, and solve the subproblem (18) to update g_{I_t}. 4: Compute ξ^t = b − D g_{I_t}. 5: Terminate if the stopping condition is achieved. Otherwise, let t = t + 1 and go to step 1.
Open Source Code	No	The paper does not provide explicit access to source code (e.g., a repository link or a clear statement of release) for the methodology described.
Open Datasets	Yes	We conduct experiments on three real-world datasets, Multilingual Reuters Collection, BBC Collection, and Cross-lingual Sentiment Dataset, to verify the effectiveness and efficiency of SHFR. Multilingual Reuters Collection is available at http://multilingreuters.iit.nrc.ca/ReutersMultiLingualMultiView.htm. BBC Collection is available at http://mlg.ucd.ie/datasets/segment.html. Cross-lingual Sentiment Dataset is available at http://www.uni-weimar.de/cms/medien/webis/research/corpora/corpus-webis-cls-10.html.
Dataset Splits	Yes	For Multilingual Reuters Collection: For each class, we randomly select 100 instances from the source domain and 10 instances from the target domain for training. We randomly select 10,000 instances from the target domain as the test data. For BBC Collection: We randomly select 70% source domain instances and 10 target domain instances for each class for training. The remaining target domain instances are considered to be the test data. For Cross-lingual Sentiment Dataset: We randomly select 1,500 source domain instances and 10 target domain instances per class for training, and use the remaining 5,970 target domain instances for testing.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments. It only mentions experiments were conducted on toy and real-world datasets.
Software Dependencies	No	We use linear SVMs to build the base classifiers for the source and target domains, which are obtained using the Liblinear solver (Fan et al., 2008) and pre-trained offline. The paper mentions a specific solver (Liblinear) but does not provide a version number.
Experiment Setup	Yes	For our proposed method SHFR, we empirically set the maximum number of iterations to 20, λ = 0.01 and B = 2 in the BGMP algorithm, and generate the ECOC as long as possible for each dataset. We use linear SVMs as the base classifiers, the regularization parameter C of which is cross-validated from the range of {0.01, 0.1, 1, 10, 100} on the source domain data.
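
The steps of Algorithm 1 quoted in the Pseudocode row can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' implementation. Subproblem (18) and the stopping condition are not reproduced on this page, so the code assumes that the subproblem is the nonnegative LASSO objective 0.5*||b - D g||^2 + λ*sum(g) restricted to the active indices, solves it by projected gradient descent, and stops when no inactive index has q_j > λ. All function names are hypothetical, and the defaults simply reuse the λ, B, and iteration limit reported in the Experiment Setup row.

```python
def transpose_times(D, r):
    """Return D^T r for a matrix D stored as a list of rows."""
    return [sum(D[i][j] * r[i] for i in range(len(D))) for j in range(len(D[0]))]


def residual(D, b, g):
    """Return b - D g."""
    return [b[i] - sum(D[i][j] * g[j] for j in range(len(g))) for i in range(len(b))]


def solve_subproblem(D, b, g, active, lam, steps=500):
    """ASSUMED form of subproblem (18): minimize
    0.5*||b - D g||^2 + lam*sum(g) subject to g >= 0, over the active
    coordinates only, using projected gradient descent."""
    # The squared Frobenius norm of the active columns bounds the
    # Lipschitz constant of the gradient, so 1/lipschitz is a safe step.
    lipschitz = sum(D[i][j] ** 2 for i in range(len(D)) for j in active) or 1.0
    for _ in range(steps):
        q = transpose_times(D, residual(D, b, g))
        for j in active:
            # Gradient of the objective at coordinate j is -(q[j] - lam).
            g[j] = max(0.0, g[j] + (q[j] - lam) / lipschitz)
    return g


def matching_pursuit_nn_lasso(D, b, lam=0.01, B=2, max_iter=20):
    n = len(D[0])
    g = [0.0] * n   # coordinates outside the active set stay at zero
    active = []     # the index set I_t
    xi = list(b)    # initial residual xi^0 = b
    for _ in range(max_iter):
        q = transpose_times(D, xi)                          # step 1
        inactive = [j for j in range(n) if j not in active]
        chosen = sorted(inactive, key=lambda j: q[j], reverse=True)[:B]
        chosen = [j for j in chosen if q[j] > lam]
        if not chosen:                                      # assumed stopping rule
            break
        active.extend(chosen)                               # step 2
        g = solve_subproblem(D, b, g, active, lam)          # step 3
        xi = residual(D, b, g)                              # step 4
    return g


# Toy usage: with an identity dictionary the minimizer is max(0, b_j - lam).
D = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b = [1.0, 0.0, 2.0]
g = matching_pursuit_nn_lasso(D, b)
print(g)
```

With the identity dictionary in the toy usage, the nonnegative LASSO solution is max(0, b_j - λ) coordinate-wise, so the result should be close to [0.99, 0, 1.99]. Selecting only inactive indices in step 1 is a design choice of this sketch; the quoted pseudocode does not say whether already selected indices are excluded.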