Optimal Parameter-Transfer Learning by Semiparametric Model Averaging
Authors: Xiaonan Hu, Xinyu Zhang
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive numerical results demonstrate the superiority of the proposed method over competitive methods. ... In Section 4, we evaluate the finite sample performance of our procedure in various numerical experiments. ... In Section 5, we apply our approach to analyze housing rental information data in Beijing. |
| Researcher Affiliation | Academia | Xiaonan Hu: School of Mathematical Sciences, Capital Normal University, Beijing 100048, China; Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China. Xinyu Zhang: Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China; International Institute of Finance, School of Management, University of Science and Technology of China, Hefei 230026, Anhui, China. |
| Pseudocode | Yes | Algorithm 1: Trans-SMAP. Input: Training samples, including the target and source data, {(x_i^(m), z_i^(m), y_i^(m)); i = 1, ..., n_m, m = 0, ..., M} from the target and source models (1), and the new sample {x_{n0+1}^(0), z_{n0+1}^(0)} from the target model. Output: Prediction of y_{n0+1}^(0) associated with the new sample {x_{n0+1}^(0), z_{n0+1}^(0)}. |
| Open Source Code | No | The paper does not provide an explicit statement from the authors about releasing their own source code for the methodology described, nor does it include a direct link to a code repository. It mentions that "All experiments are implemented in R software" (Section 4.1) and refers to third-party R packages used (Appendix B.1), but this is not the authors' implementation code. |
| Open Datasets | Yes | In this section, we apply our approach to analyze housing rental information data in Beijing, which is drawn from a publicly available data set on http://www.idatascience.cn/dataset. |
| Dataset Splits | Yes | To determine a proper choice of weights, we adopt a J-fold (J > 1) cross-validation criterion. Specifically, we randomly divide the target samples into J mutually exclusive subgroups G_1, ..., G_J. ... Remark 2: The choice of J in criterion (4) is usually uncertain in practice. Since there is no theoretically optimal value, we manually use the 5-fold CV criterion in terms of computational efficiency in this paper. ... To evaluate the out-of-sample prediction risk, we randomly split the target samples into two subgroups of equal size as the training and testing data. |
| Hardware Specification | Yes | The numerical computation executes on a regular PC with an Intel Core i7-10700 2.90 GHz CPU. |
| Software Dependencies | No | The paper mentions that "All experiments are implemented in R software" (Section 4.1) and refers to the `quadprog` R package (Appendix B.1), but specific version numbers for R or any packages are not provided. |
| Experiment Setup | Yes | Set the target sample size n0 = 150, and source sample sizes (n1, n2, n3) = (200, 200, 150). For the parametric components, x_i^(m) from the target and source models are generated from a 6-dimensional multivariate normal distribution N(0, Σ) with Σ = [Σ_{aa′}]_{6×6}, where Σ_{aa′} = 0.5^{|a−a′|}. Set the parametric coefficient vectors of the target and source models as β^(0) = (1.4, 1.2, 1, 0.8, 0.65, 0.3)^T, β^(1) = (1.4, 1.2, 1, 0.8, 0.65, 0.3, 1.8)^T + δ1, β^(2) = (1.4, 1.2, 1, 0.8, 0.65, 0.3)^T + δ2, and β^(3) = (1.4, 1.2, 1, 0.8, 0.65, 0.3)^T + δ3, where δ1 = 0.02, δ2 = 0.3, and δ3 = 0, so the parameters of the first and second source models differ from the target model, and the third source model is informative because its coefficients are exactly the same as those of the target model. |
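The simulation design quoted in the Experiment Setup row, together with the 5-fold split described under Dataset Splits, can be sketched as follows. The paper's experiments are implemented in R; this is an illustrative NumPy translation only. The error distribution, the generation of the seventh covariate in the first source model, and the broadcasting of the scalar shifts δm across all coefficients are assumptions not stated in the excerpt.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample sizes from the quoted setup: target m=0, sources m=1..3
n = {0: 150, 1: 200, 2: 200, 3: 150}
p = 6

# Toeplitz covariance: Sigma[a, a'] = 0.5 ** |a - a'|
Sigma = 0.5 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))

base = np.array([1.4, 1.2, 1.0, 0.8, 0.65, 0.3])
beta = {
    0: base,                          # target coefficients
    1: np.append(base, 1.8) + 0.02,   # delta_1 = 0.02 (assumed added elementwise)
    2: base + 0.3,                    # delta_2 = 0.3
    3: base + 0.0,                    # delta_3 = 0: the informative source
}

data = {}
for m, nm in n.items():
    X = rng.multivariate_normal(np.zeros(p), Sigma, size=nm)
    if len(beta[m]) > p:
        # Source model 1 has an extra covariate; its distribution is not
        # specified in the excerpt, so i.i.d. N(0, 1) is assumed here.
        X = np.column_stack([X, rng.standard_normal(nm)])
    y = X @ beta[m] + rng.standard_normal(nm)  # assumed N(0, 1) noise
    data[m] = (X, y)

# 5-fold partition of the target sample, mirroring the quoted CV criterion
J = 5
folds = np.array_split(rng.permutation(n[0]), J)
```

The folds would then drive the cross-validation weight criterion (4); the quadratic-programming step for the weights (solved with `quadprog` in the paper) is omitted here.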