Probabilistic Geometric Principal Component Analysis with application to neural data
Authors: Han-Lin Hsieh, Maryam Shanechi
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In simulations and brain data analyses, we show that PGPCA can effectively model the data distribution around various given manifolds and outperforms PPCA for such data. Moreover, PGPCA provides the capability to test whether the new geometric coordinate system better describes the data than the Euclidean one. |
| Researcher Affiliation | Academia | Han-Lin Hsieh: Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, U.S.A. EMAIL. Maryam M. Shanechi: Ming Hsieh Department of Electrical and Computer Engineering, Thomas Lord Department of Computer Science, Alfred E. Mann Department of Biomedical Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, U.S.A. EMAIL |
| Pseudocode | Yes | Our pseudo code summarizes PGPCA EM in Algorithm 1. |
| Open Source Code | No | The paper does not explicitly state that source code is being released or provide a link to a code repository. |
| Open Datasets | Yes | For data analysis, we utilized the neural firing rates recorded from mice's brains. This dataset is publicly available (Peyrache & Buzsáki, 2015), and further details can be found in Peyrache et al. (2015). |
| Dataset Splits | Yes | We split the 15000 samples equally into 5 trials for 5-fold cross-validation. In each fold, we concatenated 4 trials to form a training set. |
| Hardware Specification | No | It takes about 13 minutes on a regular desktop computer to learn a 10D PGPCA (GeCOV) model with 12000 training samples. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | For both cases, we generate 5000 training samples from the true model and learn every PGPCA model (EuCOV/GeCOV) with 500 landmarks z_{1:500} in (9) using 20 EM iterations. The PGPCA model dimension can be m ∈ [0, 2]. For all four cases, we generate 50000 samples from the true model, and every PGPCA model (uniAng/uniTorus, EuCOV/GeCOV) with 1000 landmarks z_{1:1000} is learned using 40 EM iterations. For both mice, we first computed the firing rates by smoothing the spike time-series with a Gaussian kernel with a standard deviation of 100 ms. The firing rates were then down-sampled to 15000 samples with a 100 ms step size (equivalent to a data duration of 25 minutes in total). As preprocessing following prior work, we first applied a square root on the firing rates to stabilize the variance (Chaudhuri et al., 2019), and then projected the data using Isomap (Tenenbaum et al., 2000) from ℝ^50 (Mouse 12) and ℝ^22 (Mouse 28) to ℝ^10. |
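
The Dataset Splits row describes dividing 15000 samples equally into 5 trials and training on 4 of them in each fold. Since no source code is released, the following is only a minimal sketch of one way such a split could look. It assumes the trials are contiguous blocks in time (the paper excerpt does not say so explicitly), and the function name `contiguous_folds` and the random placeholder array are hypothetical stand-ins, not the authors' code or data.

```python
import numpy as np


def contiguous_folds(n_samples, n_folds):
    """Yield (train_idx, test_idx) pairs built from equal contiguous trials.

    Assumption: each "trial" is a contiguous block of samples.
    """
    trials = np.array_split(np.arange(n_samples), n_folds)
    for k in range(n_folds):
        # Concatenate every trial except the held-out one to form the training set.
        train_idx = np.concatenate([t for i, t in enumerate(trials) if i != k])
        yield train_idx, trials[k]


# Placeholder standing in for the 15000 x 10 preprocessed samples.
rng = np.random.default_rng(0)
data = rng.normal(size=(15000, 10))

folds = list(contiguous_folds(data.shape[0], 5))
for train_idx, test_idx in folds:
    train, test = data[train_idx], data[test_idx]
    # A model would be fit on `train` and evaluated on `test` here.
```

With these numbers each fold has 12000 training samples and 3000 held-out samples, which matches the 12000 training samples quoted in the Hardware Specification row.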