Integrated Principal Components Analysis
Authors: Tiffany M. Tang, Genevera I. Allen
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also demonstrate the practical advantages of iPCA through extensive simulations and a case study application to integrative genomics for Alzheimer's disease. In particular, we show that the joint patterns extracted via iPCA are highly predictive of a patient's cognition and Alzheimer's diagnosis. |
| Researcher Affiliation | Academia | Tiffany M. Tang (EMAIL), Department of Statistics, University of California, Berkeley, Berkeley, CA 94720, USA; Genevera I. Allen (EMAIL), Department of Electrical and Computer Engineering, Rice University, Houston, TX 77005, USA |
| Pseudocode | Yes | Algorithm 1: Flip-Flop Algorithm for Multiplicative Frobenius iPCA Estimators; Algorithm 2: Flip-Flop Algorithm for iPCA Unpenalized MLEs; Algorithm 3: Outline of Flip-Flop Algorithm for Penalized iPCA Covariance Estimators; Algorithm 4: Flip-Flop Algorithm for Additive Frobenius Penalized iPCA Estimators; Algorithm 5: Flip-Flop Algorithm for Additive L1 Penalized iPCA Covariance Estimators; Algorithm 6: Flip-Flop Algorithm for Additive L1 Correlation-Penalized iPCA Estimators; Algorithm 7: Selecting Penalty Parameters via Missing Imputation Framework; Algorithm 8: Full MCECM Algorithm for iPCA; Algorithm 9: One-Step MCECM Approximation |
| Open Source Code | Yes | R code can be found at https://github.com/DataSlingers/iPCA. |
| Open Datasets | Yes | we delve into the integrative genomics of AD and jointly analyze miRNA expression, gene expression via RNASeq, and DNA methylation data obtained from the Religious Orders Study Memory and Aging Project (ROSMAP) Study (Mostafavi et al., 2018). The ROSMAP study is a longitudinal clinical-pathological cohort study of aging and AD, consisting of 507 subjects, 309 miRNAs, 900 genes, and 1250 CpG (methylation) sites after preprocessing (which we detail in Appendix I). ...miRNA data from TCGA ovarian cancer (The Cancer Genome Atlas Research Network, 2011) |
| Dataset Splits | Yes | For the random forest, we split the ROSMAP data into a training (n = 375) and test set (n = 132) and used the default random forest settings in R. |
| Hardware Specification | No | The paper does not contain specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments. It mentions 'computational constraints' but no specific hardware. |
| Software Dependencies | No | The paper mentions 'R code', 'random forest settings in R', 'huge package (Jiang et al., 2019) in R', and 'sparseEigen R package (Benidis et al., 2016)'. However, it does not provide specific version numbers for R or any of the mentioned packages, which is required for reproducibility. |
| Experiment Setup | Yes | The stopping rule in the Flip-Flop algorithms for the additive and multiplicative Frobenius iPCA estimators is given by $\lambda^{1/2} \lVert \hat{\Sigma}_t^{-1} - \hat{\Sigma}_{t-1}^{-1} \rVert_F / \lVert \hat{\Sigma}_{t-1}^{-1} \rVert_F < 10^{-6}$, where $\lambda$ denotes the mean of the penalty parameters and $\hat{\Sigma}_t$ denotes the estimate of $\Sigma$ in the $t$th iteration. Due to computational constraints, we stop the L1 Flip-Flop algorithm after one iteration, and we select the iPCA penalty parameters in a greedy manner, as discussed in Section 3.2.2. ...For the random forest, we split the ROSMAP data into a training (n = 375) and test set (n = 132) and used the default random forest settings in R. ...Here, we used the sparseEigen R package (Benidis et al., 2016) and chose the tuning parameter such that there were only 12 non-zero features. |
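
The stopping rule quoted in the Experiment Setup row can be illustrated with a short sketch. This is not the authors' R implementation. It assumes the rule compares successive inverse covariance estimates by relative Frobenius-norm change, scaled by the square root of the mean penalty parameter, against a tolerance of 10^-6. The function names `frobenius_norm` and `has_converged` are hypothetical, and matrices are represented as plain lists of rows.

```python
import math


def frobenius_norm(matrix):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(x * x for row in matrix for x in row))


def has_converged(prec_curr, prec_prev, penalties, tol=1e-6):
    """Relative-change stopping rule (assumed reading of the quoted formula).

    prec_curr, prec_prev: inverse covariance estimates at iterations t and t-1.
    penalties: penalty parameters; their mean plays the role of lambda.
    """
    mean_penalty = sum(penalties) / len(penalties)
    # Elementwise difference between the two successive estimates.
    diff = [
        [a - b for a, b in zip(row_curr, row_prev)]
        for row_curr, row_prev in zip(prec_curr, prec_prev)
    ]
    relative_change = frobenius_norm(diff) / frobenius_norm(prec_prev)
    return math.sqrt(mean_penalty) * relative_change < tol


# Illustrative values only (not taken from the paper).
previous = [[2.0, 0.0], [0.0, 2.0]]
nearly_same = [[2.0 + 1e-9, 0.0], [0.0, 2.0]]
clearly_different = [[2.5, 0.0], [0.0, 2.0]]

print(has_converged(nearly_same, previous, [1.0, 4.0]))
print(has_converged(clearly_different, previous, [1.0, 4.0]))
```

In an iterative routine, a check like this would run at the end of each pass, and the loop would exit once it returns `True`.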