Functional L-Optimality Subsampling for Functional Generalized Linear Models with Massive Data
Authors: Hua Liu, Jinhong You, Jiguo Cao
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The analysis results from extensive simulation studies and from the kidney transplant data show that the functional L-optimality subsampling (FLoS) method is much better than the uniform subsampling approach and can well approximate the results based on the full data while dramatically reducing the computation time and memory. |
| Researcher Affiliation | Academia | Hua Liu EMAIL School of Economics and Finance, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China; Jinhong You EMAIL School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai 200433, China; Jiguo Cao EMAIL Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada |
| Pseudocode | Yes | Algorithm 1: FLoS Algorithm for Functional Generalized Linear Model |
| Open Source Code | Yes | In addition, an R package SubsamplingFunPredictors has been developed for implementing the FLoS method. The R package and the R codes for the simulation studies can be downloaded at https://github.com/caojiguo/FLoS. |
| Open Datasets | Yes | The organ transplant data from the Organ Procurement Transplant Network/United Network for Organ Sharing (OPTN/UNOS) as of September 2020 is a massive functional data, which is available at https://optn.transplant.hrsa.gov/ with the permission of OPTN/UNOS. |
| Dataset Splits | No | The paper describes how the kidney transplant data recipients were categorized (e.g., 23.3% for Y=0, 76.7% for Y=1), and simulation studies generated data, but it does not specify explicit training/test/validation dataset splits used for model evaluation in the conventional sense. |
| Hardware Specification | Yes | All computations are carried out on a computation platform with Intel Xeon 5 CPU with 4 cores and 8G memory. |
| Software Dependencies | Yes | This paper uses the R programming language (enhanced R distribution Microsoft R 4.0.2) to implement each method. |
| Experiment Setup | Yes | In the implementation, we make the number of knots $K = \lceil 1.25\, n^{1/4} \rceil$ according to Assumption 5, where $\lceil a \rceil$ means the least integer greater than or equal to a. ... Usually, in practice, we choose p = 3 and K is chosen to be relatively large so that local features of β(t) can be captured. Once K and p are fixed, we can select the smoothing parameter λ by minimizing BIC. |
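
The knot rule quoted in the Experiment Setup row, K equal to the ceiling of 1.25 n^{1/4}, can be illustrated with a minimal sketch. The paper's own implementation is in R; the Python function below and its name `num_knots` are hypothetical and not taken from the authors' package. It assumes n is a positive integer sample size, and it uses exact integer arithmetic (K ≥ 1.25 n^{1/4} is equivalent to 256 K^4 ≥ 625 n) so that floating-point rounding cannot shift the ceiling.

```python
def num_knots(n: int) -> int:
    """Smallest integer K with K >= 1.25 * n**(1/4).

    Hypothetical helper illustrating the quoted rule; not the authors' code.
    The inequality K >= (5/4) * n**(1/4) is rewritten as
    256 * K**4 >= 625 * n, which needs only integer arithmetic.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    k = 1
    # K grows like n**(1/4), so this loop is short even for massive n.
    while 256 * k**4 < 625 * n:
        k += 1
    return k


if __name__ == "__main__":
    for n in (16, 256, 10_000, 1_000_000):
        print(n, num_knots(n))
```

Because K grows only with the fourth root of n, the number of knots stays small even for very large samples: a million observations give 40 knots under this rule.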