Transfer Learning in Information Criteria-based Feature Selection
Authors: Shaohan Chen, Nikolaos V. Sahinidis, Chuanhou Gao
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, simulation studies and applications with real data demonstrate the usefulness of the TLCp scheme. |
| Researcher Affiliation | Academia | Shaohan Chen EMAIL School of Mathematical Sciences Zhejiang University Hangzhou 310027, China; Nikolaos V. Sahinidis EMAIL H. Milton Stewart School of Industrial & Systems Engineering and School of Chemical & Biomolecular Engineering Georgia Institute of Technology Atlanta, GA 30332, USA; Chuanhou Gao EMAIL School of Mathematical Sciences Zhejiang University Hangzhou 310027, China |
| Pseudocode | Yes | Algorithm 1 Using the approximate Cp method to select features; Algorithm 2 Using the approximate TLCp method to select features for the target task |
| Open Source Code | Yes | The source code for reproducing the experimental results is available at https://github.com/Shaohan-Chen/Transfer-learning-in-Mallows-Cp. |
| Open Datasets | Yes | In this subsection, we evaluate the performance of the proposed TLCp method on school data used by Bakker and Heskes (2003), Argyriou et al. (2008) and Zhou et al. (2011)... We finally test the proposed TLCp methods using the Parkinson’s telemonitoring data set from the UCI Machine Learning Repository (Tsanas et al., 2009). |
| Dataset Splits | Yes | For each target data size (n = 210, 250, 290), we randomly split the target data set (furnace A) 300 times with n samples as the training set and the remaining 100 samples as the test set. For each target sample size (n = 130, 150, 170), we divide the target data set into 10000 random splits with n samples as the training data and the remaining 30 samples as the test data. For each sample size (n = 100, 110), we randomly split the target data set 5000 times with n samples as the training set and the remaining 30 as the test set. |
| Hardware Specification | Yes | All experiments in this paper were conducted on a computer with a 6-core, 2.60-GHz CPU and 16-GB memory. |
| Software Dependencies | Yes | We use the software package from Zhou et al. (2011) and Mathworks (2017) to solve these two multi-task methods. We implement the aforementioned benchmarks based on the statistics and machine learning toolbox (Mathworks, 2017). |
| Experiment Setup | Yes | We chose the tuning parameter of the Cp model (4) as λ = 2, and set the parameters of the TLCp model (8), λ_1, λ_2, λ_3, λ_4, according to the tuning rules stated in Corollary 15 or Theorem 20, as λ_1 = 1, λ_2 = 1, λ_3^i = 4/δ_i² (i = 1, …, k), λ_4 = 2. We tune the hyperparameters of the proposed TLCp methods with two tasks based on Theorem 20, as λ*_1 = σ̂_2², λ*_2 = σ̂_1², λ_3^t = 4σ̂_1²σ̂_2²/δ̂_t² (t = 1, …, k), and λ*_4 = min_{i ∈ {1, …, k}}... |
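The setup above fixes the Cp tuning parameter at λ = 2, the classic Mallows' Cp penalty. As a rough illustration of what such an information criterion-based selection does (not the paper's TLCp method, and with a hypothetical function name `mallows_cp_select`), a minimal sketch of exhaustive Cp-style subset selection might look like:

```python
import itertools
import numpy as np

def mallows_cp_select(X, y, lam=2.0):
    """Pick the feature subset minimizing a Cp-style criterion
    RSS(S)/sigma2_hat + lam * |S|, with sigma2_hat estimated from
    the full least-squares fit. Exhaustive search; only feasible
    for a small number of candidate features k."""
    n, k = X.shape
    # Noise variance estimate from the full model.
    beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss_full = np.sum((y - X @ beta_full) ** 2)
    sigma2 = rss_full / (n - k)
    best_subset, best_cp = (), np.inf
    # Score every non-empty subset of the k features.
    for r in range(1, k + 1):
        for subset in itertools.combinations(range(k), r):
            Xs = X[:, subset]
            beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            rss = np.sum((y - Xs @ beta) ** 2)
            cp = rss / sigma2 + lam * len(subset)
            if cp < best_cp:
                best_subset, best_cp = subset, cp
    return best_subset, best_cp
```

On synthetic data where only a couple of features carry signal, the λ·|S| penalty discourages the selector from keeping irrelevant columns; raising `lam` makes the selection sparser. The paper's approximate Cp/TLCp algorithms (Algorithms 1 and 2) avoid this exponential subset enumeration.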