Nonparametric Principal Subspace Regression

Authors: Yang Zhou, Mark Koudstaal, Dengdeng Yu, Dehan Kong, Fang Yao

JMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Favorable finite-sample performance is illustrated through simulated and real data examples in Section 4 and 5, respectively.
Researcher Affiliation | Academia | School of Statistics, Beijing Normal University, Beijing 100875, China; Department of Statistical Sciences, University of Toronto, Toronto, ON M5S 3G3, Canada; Department of Probability & Statistics, Center for Statistical Science, Peking University, Beijing 100871, China
Pseudocode | No | The paper describes a 'two-step fitting procedure' in Section 2.2, but it is presented as regular text, not as a structured pseudocode block or algorithm: 'Step 1. For a given r ≤ q, let Û_[r] = (û_1, ..., û_r) be the top r left singular vectors of data Y = (y_1, ..., y_n) ∈ R^{p×n} from model (1); Step 2. Plug in Û_[r] into R_{D_n}(G) and find the corresponding minimizers of the R_{D_n}(û_k, g_k) by applying local polynomial smoothing for k = 1, ..., r separately, denoted by f̂_1, ..., f̂_r.'
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It only references links for publicly available datasets used in the experiments.
Open Datasets | Yes | We apply the proposed method to an EEG data set, which is available at https://archive.ics.uci.edu/ml/datasets/EEG+Database. For another data application, we analyze the motor task-related fMRI data from the Human Connectome Project (HCP) Data https://www.humanconnectome.org/
Dataset Splits | Yes | For each subject, we randomly reserve 10% of data as the test set: S_test ⊂ {1, ..., 256} such that |S_test|/256 ≈ 10%, while using the rest as the training set, and report the prediction errors... Same as the above example, we randomly select 10% of data as the test set and the rest of the data as the training set for each subject.
Hardware Specification | No | No specific details about the hardware used for running the experiments are provided. The mention of '3 Tesla magnetic resonance imaging data' refers to data acquisition for the fMRI study, not the computational hardware for the proposed method.
Software Dependencies | No | The paper mentions using 'local polynomial regression with a Gaussian kernel' for implementation and 'five-fold cross-validation' for parameter selection, but does not specify any software names or version numbers (e.g., programming languages, libraries, frameworks, or solvers).
Experiment Setup | No | The paper describes the generation of simulated data and the general approach for real data applications, including data dimensions and evaluation metrics. It mentions choosing bandwidth 'by the standard five-fold cross-validation' and selecting 'r' by AIC, but it does not provide concrete hyperparameter values or detailed configurations for the local polynomial regression or other experimental settings.
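
Since the paper provides neither pseudocode nor source code, the two-step fitting procedure quoted in the Pseudocode row can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions that the summary does not confirm: the predictor x is scalar, the smoothing targets in Step 2 are the projected scores û_k^T y_i, the local polynomial is of degree one (local linear), and the bandwidth is picked from a user-supplied candidate list by five-fold cross-validation. All function names are hypothetical.

```python
import numpy as np


def top_left_singular_vectors(Y, r):
    """Step 1: top-r left singular vectors of the p x n data matrix Y."""
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    return U[:, :r]


def local_linear_smooth(x_train, z_train, x_eval, bandwidth):
    """Local linear regression of z on scalar x with a Gaussian kernel.

    Assumption: degree-one local polynomial; the paper's degree is not stated here.
    """
    x_train = np.asarray(x_train, dtype=float)
    z_train = np.asarray(z_train, dtype=float)
    x_eval = np.asarray(x_eval, dtype=float)
    fitted = np.empty(len(x_eval))
    for j, x0 in enumerate(x_eval):
        d = x_train - x0
        w = np.exp(-0.5 * (d / bandwidth) ** 2)      # Gaussian kernel weights
        X = np.column_stack([np.ones_like(d), d])    # local intercept and slope
        XtW = X.T * w
        beta = np.linalg.lstsq(XtW @ X, XtW @ z_train, rcond=None)[0]
        fitted[j] = beta[0]                          # intercept = fit at x0
    return fitted


def choose_bandwidth_cv(x, z, candidates, n_folds=5, seed=0):
    """Pick the candidate bandwidth with the smallest K-fold squared error."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    errors = []
    for h in candidates:
        sq_err = 0.0
        for test_idx in folds:
            train = np.ones(len(x), dtype=bool)
            train[test_idx] = False
            pred = local_linear_smooth(x[train], z[train], x[test_idx], h)
            sq_err += np.sum((z[test_idx] - pred) ** 2)
        errors.append(sq_err)
    return candidates[int(np.argmin(errors))]


def two_step_fit(Y, x, r, bandwidth, x_eval):
    """Step 1 (SVD) followed by Step 2 (smooth each projected score on x)."""
    U_hat = top_left_singular_vectors(Y, r)
    scores = U_hat.T @ Y                             # r x n; assumed smoothing targets
    F_hat = np.vstack([
        local_linear_smooth(x, scores[k], x_eval, bandwidth) for k in range(r)
    ])
    return U_hat, F_hat
```

Under these assumptions, `U_hat @ F_hat` gives fitted values of the p-dimensional response at the points in `x_eval`. A held-out evaluation like the one in the Dataset Splits row could then compare those values with the reserved 10% of observations. The selection of r by AIC mentioned in the Experiment Setup row is not sketched here because the summary does not give the criterion's form.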