Multi-class Heterogeneous Domain Adaptation
Authors: Joey Tianyi Zhou, Ivor W. Tsang, Sinno Jialin Pan, Mingkui Tan
JMLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Lastly, we conduct extensive experiments on both synthetic and real-world datasets to demonstrate the superiority of our proposed method over existing state-of-the-art HDA methods in terms of prediction accuracy and training efficiency. In Section 6, we conduct extensive experiments on both toy and real-world datasets to demonstrate the effectiveness and efficiency of SHFR. |
| Researcher Affiliation | Academia | Joey Tianyi Zhou EMAIL Institute of High Performance Computing (IHPC) Agency for Science, Technology and Research (A*STAR), Singapore 138632. Ivor W. Tsang EMAIL Centre for Artificial Intelligence University of Technology Sydney Australia. Sinno Jialin Pan EMAIL School of Computer Science and Engineering Nanyang Technological University, Singapore 639798. Mingkui Tan EMAIL School of Software Engineering South China University of Technology, China. |
| Pseudocode | Yes | Algorithm 1 Matching pursuit LASSO for solving the nonnegative LASSO problem. Initialize the residual $\xi^0 = b$, $I_0 = \emptyset$, and let $t = 1$. 1: Compute $q = D^\top \xi^{t-1}$, choose the $B$ largest $q_j$, and record their indices by $J_t$. 2: Let $I_t = I_{t-1} \cup J_t$. 3: Let $g_{I_t^c} = 0$, and solve the subproblem (18) to update $g_{I_t}$. 4: Compute $\xi^t = b - D g_{I_t}$. 5: Terminate if the stopping condition is achieved. Otherwise, let $t = t + 1$ and go to step 1. |
| Open Source Code | No | The paper does not provide explicit access to source code (e.g., a repository link or a clear statement of release) for the methodology described. |
| Open Datasets | Yes | We conduct experiments on three real-world datasets, Multilingual Reuters Collection, BBC Collection, and Cross-lingual Sentiment Dataset, to verify the effectiveness and efficiency of SHFR. Multilingual Reuters Collection is available at http://multilingreuters.iit.nrc.ca/ReutersMultiLingualMultiView.htm. BBC Collection is available at http://mlg.ucd.ie/datasets/segment.html. Cross-lingual Sentiment Dataset is available at http://www.uni-weimar.de/cms/medien/webis/research/corpora/corpus-webis-cls-10.html. |
| Dataset Splits | Yes | For Multilingual Reuters Collection: For each class, we randomly select 100 instances from the source domain and 10 instances from the target domain for training. We randomly select 10,000 instances from the target domain as the test data. For BBC Collection: We randomly select 70% source domain instances and 10 target domain instances for each class for training. The remaining target domain instances are considered to be the test data. For Cross-lingual Sentiment Dataset: We randomly select 1,500 source domain instances and 10 target domain instances per class for training, and use the remaining 5,970 target domain instances for testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments. It only mentions experiments were conducted on toy and real-world datasets. |
| Software Dependencies | No | We use linear SVMs to build the base classifiers for the source and target domains, which are obtained using the Liblinear solver (Fan et al., 2008) and pre-trained offline. The paper mentions a specific solver (Liblinear) but does not provide a version number. |
| Experiment Setup | Yes | For our proposed method SHFR, we empirically set the maximum number of iterations to 20, λ = 0.01 and B = 2 in the BGMP algorithm, and generate the ECOC as long as possible for each dataset. We use linear SVMs as the base classifiers, the regularization parameter C of which is cross-validated from the range of {0.01, 0.1, 1, 10, 100} on the source domain data. |
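
The loop of Algorithm 1 quoted in the Pseudocode row can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. Several details are assumptions, because the excerpt does not give them: subproblem (18) is taken to be the nonnegative LASSO objective 0.5·‖b − Dg‖² + λ·Σg restricted to the active set and is solved here by coordinate descent; new indices are drawn only from atoms that are not yet active; and the stopping rule (no remaining correlation above λ, or a near-zero residual) is a guess. The names `matching_pursuit_nn_lasso`, `solve_subproblem`, `cols`, and `tol` are hypothetical. The defaults λ = 0.01, B = 2, and 20 iterations follow the Experiment Setup row.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def residual(cols, g, b):
    """Return b - D g, where D is stored as a list of columns."""
    r = list(b)
    for col, gj in zip(cols, g):
        if gj != 0.0:
            for i in range(len(r)):
                r[i] -= gj * col[i]
    return r


def solve_subproblem(cols, b, g, active, lam, sweeps=200):
    """ASSUMED stand-in for subproblem (18): coordinate descent on
    0.5*||b - D g||^2 + lam*sum(g), g >= 0, over the active indices only."""
    r = residual(cols, g, b)
    for _ in range(sweeps):
        for j in active:
            col = cols[j]
            norm2 = dot(col, col)
            if norm2 == 0.0:
                continue
            # Exact minimizer for coordinate j, projected onto g_j >= 0.
            new = max(0.0, g[j] + (dot(col, r) - lam) / norm2)
            delta = new - g[j]
            if delta != 0.0:
                for i in range(len(r)):
                    r[i] -= delta * col[i]
                g[j] = new
    return g


def matching_pursuit_nn_lasso(cols, b, lam=0.01, B=2, max_iter=20, tol=1e-8):
    """Sketch of the matching pursuit loop; returns (g, active index list)."""
    n = len(cols)
    g = [0.0] * n            # coefficients outside the active set stay 0
    active = []              # I_t, starts empty
    xi = list(b)             # residual, xi^0 = b
    for _ in range(max_iter):
        q = [dot(col, xi) for col in cols]           # q = D^T xi^{t-1}
        candidates = [j for j in range(n) if j not in active]
        # Indices of the B largest q_j among inactive atoms (assumption).
        J = sorted(candidates, key=lambda j: q[j], reverse=True)[:B]
        if not J or q[J[0]] <= lam:                  # assumed stopping rule
            break
        active.extend(J)                             # I_t = I_{t-1} union J_t
        solve_subproblem(cols, b, g, active, lam)    # update g on I_t
        xi = residual(cols, g, b)                    # xi^t = b - D g
        if dot(xi, xi) <= tol:                       # assumed stopping rule
            break
    return g, active


# Toy usage: three orthonormal atoms (columns of the identity matrix).
cols = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b = [2.0, 0.0, 1.0]
g, active = matching_pursuit_nn_lasso(cols, b, lam=0.01, B=2)
# g is approximately [1.99, 0.0, 0.99]; atoms 0 and 2 are selected.
```

With orthonormal atoms each selected coefficient equals its correlation with `b` minus λ, which makes the shrinkage effect of the penalty easy to check by hand. A real dictionary would need a faster subproblem solver and a numerical library rather than nested Python loops.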