Nonparametric Principal Subspace Regression
Authors: Yang Zhou, Mark Koudstaal, Dengdeng Yu, Dehan Kong, Fang Yao
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Favorable finite-sample performance is illustrated through simulated and real data examples in Section 4 and 5, respectively. |
| Researcher Affiliation | Academia | School of Statistics, Beijing Normal University, Beijing 100875, China; Department of Statistical Sciences, University of Toronto, Toronto, ON M5S 3G3, Canada; Department of Probability & Statistics, Center for Statistical Science, Peking University, Beijing 100871, China |
| Pseudocode | No | The paper describes a 'two-step fitting procedure' in Section 2.2, but it is presented as regular text, not as a structured pseudocode block or algorithm: 'Step 1. For a given $r \le q$, let $\hat{U}_{[r]} = (\hat{u}_1, \ldots, \hat{u}_r)$ be the top $r$ left singular vectors of data $Y = (y_1, \ldots, y_n) \in \mathbb{R}^{p \times n}$ from model (1); Step 2. Plug in $\hat{U}_{[r]}$ into $R_{D_n}(G)$ and find the corresponding minimizers of the $R_{D_n}(\hat{u}_k, g_k)$ by applying local polynomial smoothing for $k = 1, \ldots, r$ separately, denoted by $\hat{f}_1, \ldots, \hat{f}_r$.' |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It only references links for publicly available datasets used in the experiments. |
| Open Datasets | Yes | We apply the proposed method to an EEG data set, which is available at https://archive.ics.uci.edu/ml/datasets/EEG+Database. For another data application, we analyze the motor task-related fMRI data from the Human Connectome Project (HCP) Data https://www.humanconnectome.org/ |
| Dataset Splits | Yes | For each subject, we randomly reserve 10% of data as the test set: $S_{\text{test}} \subset \{1, \ldots, 256\}$ such that $\lvert S_{\text{test}} \rvert / 256 \approx 10\%$, while using the rest as the training set, and report the prediction errors... Same as the above example, we randomly select 10% of data as the test set and the rest of the data as the training set for each subject. |
| Hardware Specification | No | No specific details about the hardware used for running the experiments are provided. The mention of '3 Tesla magnetic resonance imaging data' refers to data acquisition for the fMRI study, not the computational hardware for the proposed method. |
| Software Dependencies | No | The paper mentions using 'local polynomial regression with a Gaussian kernel' for implementation and 'five-fold cross-validation' for parameter selection, but does not specify any software names or version numbers (e.g., programming languages, libraries, frameworks, or solvers). |
| Experiment Setup | No | The paper describes the generation of simulated data and the general approach for real data applications, including data dimensions and evaluation metrics. It mentions choosing bandwidth 'by the standard five-fold cross-validation' and selecting 'r' by AIC, but it does not provide concrete hyperparameter values or detailed configurations for the local polynomial regression or other experimental settings. |
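
Since the paper provides neither pseudocode nor source code, the following is a minimal Python sketch of the steps quoted in the table: the two-step fit (SVD, then local polynomial smoothing with a Gaussian kernel), the random 10% test split, and five-fold cross-validation for the bandwidth. It is not the authors' implementation. Several things here are assumptions rather than details from the paper: a scalar covariate `x`, smoothing the projected scores `U_r.T @ Y` against `x`, local linear (degree 1) fits, a single bandwidth shared across components, and all function names (`local_linear_smooth`, `two_step_fit`, `random_split`, `cv_bandwidth`).

```python
import numpy as np


def local_linear_smooth(x_train, z_train, x_eval, bandwidth):
    """Local linear regression of z on scalar x with a Gaussian kernel."""
    x_train = np.asarray(x_train, dtype=float)
    z_train = np.asarray(z_train, dtype=float)
    x_eval = np.asarray(x_eval, dtype=float)
    fitted = np.empty(len(x_eval))
    for j, x0 in enumerate(x_eval):
        d = x_train - x0
        w = np.exp(-0.5 * (d / bandwidth) ** 2)      # Gaussian kernel weights
        design = np.column_stack([np.ones_like(d), d])
        sw = np.sqrt(w)                               # weighted least squares
        beta, *_ = np.linalg.lstsq(design * sw[:, None], z_train * sw, rcond=None)
        fitted[j] = beta[0]                           # intercept = fit at x0
    return fitted


def two_step_fit(Y, x, r, bandwidth, x_eval=None):
    """Assumed reading of the two steps; Y is p x n, x has length n."""
    # Step 1: top r left singular vectors of Y.
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    U_r = U[:, :r]
    # Step 2 (assumption): smooth each projected score against x separately.
    scores = U_r.T @ Y                                # shape (r, n)
    x_eval = x if x_eval is None else x_eval
    F_hat = np.vstack(
        [local_linear_smooth(x, scores[k], x_eval, bandwidth) for k in range(r)]
    )
    return U_r, F_hat                                 # fitted mean: U_r @ F_hat


def random_split(n, test_frac=0.1, seed=0):
    """Randomly reserve about test_frac of the indices 0..n-1 as a test set."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_test = int(round(test_frac * n))
    return np.sort(perm[n_test:]), np.sort(perm[:n_test])   # train, test


def cv_bandwidth(x, z, candidates, n_folds=5, seed=0):
    """Pick the candidate bandwidth with the smallest K-fold CV squared error."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    errors = []
    for h in candidates:
        sse = 0.0
        for held_out in folds:
            keep = np.setdiff1d(np.arange(len(x)), held_out)
            pred = local_linear_smooth(x[keep], z[keep], x[held_out], h)
            sse += np.sum((z[held_out] - pred) ** 2)
        errors.append(sse / len(x))
    return candidates[int(np.argmin(errors))]
```

For example, `train_idx, test_idx = random_split(256)` mirrors the per-subject split described under Dataset Splits, and `U_r, F_hat = two_step_fit(Y[:, train_idx], x[train_idx], r, h, x_eval=x[test_idx])` would give predictions `U_r @ F_hat` at the held-out points. The rank `r`, which the paper selects by AIC, is left as an input here.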