Integrated Principal Components Analysis

Authors: Tiffany M. Tang, Genevera I. Allen

JMLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We also demonstrate the practical advantages of iPCA through extensive simulations and a case study application to integrative genomics for Alzheimer's disease. In particular, we show that the joint patterns extracted via iPCA are highly predictive of a patient's cognition and Alzheimer's diagnosis.
Researcher Affiliation | Academia | Tiffany M. Tang, Department of Statistics, University of California, Berkeley, Berkeley, CA 94720, USA; Genevera I. Allen, Department of Electrical and Computer Engineering, Rice University, Houston, TX 77005, USA
Pseudocode | Yes | Algorithm 1: Flip-Flop Algorithm for Multiplicative Frobenius iPCA Estimators; Algorithm 2: Flip-Flop Algorithm for iPCA Unpenalized MLEs; Algorithm 3: Outline of Flip-Flop Algorithm for Penalized iPCA Covariance Estimators; Algorithm 4: Flip-Flop Algorithm for Additive Frobenius Penalized iPCA Estimators; Algorithm 5: Flip-Flop Algorithm for Additive L1 Penalized iPCA Covariance Estimators; Algorithm 6: Flip-Flop Algorithm for Additive L1 Correlation-Penalized iPCA Estimators; Algorithm 7: Selecting Penalty Parameters via Missing Imputation Framework; Algorithm 8: Full MCECM Algorithm for iPCA; Algorithm 9: One-Step MCECM Approximation
Open Source Code | Yes | R code can be found at https://github.com/DataSlingers/iPCA.
Open Datasets | Yes | we delve into the integrative genomics of AD and jointly analyze miRNA expression, gene expression via RNASeq, and DNA methylation data obtained from the Religious Orders Study Memory and Aging Project (ROSMAP) Study (Mostafavi et al., 2018). The ROSMAP study is a longitudinal clinical-pathological cohort study of aging and AD, consisting of 507 subjects, 309 miRNAs, 900 genes, and 1250 CpG (methylation) sites after preprocessing (which we detail in Appendix I). ...miRNA data from TCGA ovarian cancer (The Cancer Genome Atlas Research Network, 2011)
Dataset Splits | Yes | For the random forest, we split the ROSMAP data into a training (n = 375) and test set (n = 132) and used the default random forest settings in R.
Hardware Specification | No | The paper does not contain specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments. It mentions 'computational constraints' but no specific hardware.
Software Dependencies | No | The paper mentions 'R code', 'random forest settings in R', 'huge package (Jiang et al., 2019) in R', and 'sparseEigen R package (Benidis et al., 2016)'. However, it does not provide specific version numbers for R or any of the mentioned packages, which is required for reproducibility.
Experiment Setup | Yes | The stopping rule in the Flip-Flop algorithms for the additive and multiplicative Frobenius iPCA estimators is given by $\bar{\lambda}^{1/2} \, \lVert \hat{\Sigma}_t^{-1} - \hat{\Sigma}_{t-1}^{-1} \rVert_F / \lVert \hat{\Sigma}_{t-1}^{-1} \rVert_F < 10^{-6}$, where $\bar{\lambda}$ denotes the mean of the penalty parameters and $\hat{\Sigma}_t$ denotes the estimate of $\Sigma$ in the $t$th iteration. Due to computational constraints, we stop the L1 Flip-Flop algorithm after one iteration, and we select the iPCA penalty parameters in a greedy manner, as discussed in Section 3.2.2. ...For the random forest, we split the ROSMAP data into a training (n = 375) and test set (n = 132) and used the default random forest settings in R. ...Here, we used the sparseEigen R package (Benidis et al., 2016) and chose the tuning parameter such that there were only 12 non-zero features.
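
The stopping rule quoted in the Experiment Setup row can be illustrated with a short sketch. This is not the authors' R implementation. It is a minimal Python illustration that assumes the rule reads as: the square root of the mean penalty parameter, times the relative Frobenius-norm change between successive inverse covariance estimates, falls below 10^-6. The function names (`frobenius_norm`, `matrix_difference`, `flip_flop_converged`) and the list-of-lists matrix representation are hypothetical choices made for this example.

```python
import math
from statistics import fmean


def frobenius_norm(matrix):
    """Frobenius norm: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in matrix for x in row))


def matrix_difference(a, b):
    """Entry-wise difference a - b of two equally sized matrices."""
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def flip_flop_converged(sigma_inv_new, sigma_inv_old, penalties, tol=1e-6):
    """Check the (assumed) Flip-Flop stopping rule.

    sigma_inv_new, sigma_inv_old: inverse covariance estimates at
        iterations t and t-1 (lists of lists of floats).
    penalties: penalty parameters; their mean plays the role of lambda-bar.
    tol: convergence tolerance (1e-6 in the quoted setup).
    """
    lam_bar = fmean(penalties)
    change = frobenius_norm(matrix_difference(sigma_inv_new, sigma_inv_old))
    rel_change = change / frobenius_norm(sigma_inv_old)
    return math.sqrt(lam_bar) * rel_change < tol


# Toy usage: two nearly identical 2x2 estimates and two penalty parameters.
identity = [[1.0, 0.0], [0.0, 1.0]]
nearly_identity = [[1.0 + 1e-9, 0.0], [0.0, 1.0]]
print(flip_flop_converged(nearly_identity, identity, penalties=[1.0, 4.0]))
```

In an actual Flip-Flop loop, a check of this kind would be evaluated once per iteration after each covariance update. The quoted setup notes that the L1 variant is instead stopped after a single iteration.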