reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Probabilistic Geometric Principal Component Analysis with application to neural data

Authors: Han-Lin Hsieh, Maryam Shanechi

ICLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In simulations and brain data analyses, we show that PGPCA can effectively model the data distribution around various given manifolds and outperforms PPCA for such data. Moreover, PGPCA provides the capability to test whether the new geometric coordinate system better describes the data than the Euclidean one.
Researcher Affiliation	Academia	Han-Lin Hsieh: Ming Hsieh Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, U.S.A. Maryam M. Shanechi: Ming Hsieh Department of Electrical and Computer Engineering, Thomas Lord Department of Computer Science, Alfred E. Mann Department of Biomedical Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, U.S.A.
Pseudocode	Yes	Our pseudo code summarizes PGPCA EM in Algorithm 1.
Open Source Code	No	The paper does not explicitly state that source code is being released or provide a link to a code repository.
Open Datasets	Yes	For data analysis, we utilized the neural firing rates recorded from mice's brains. This dataset is publicly available (Peyrache & Buzsáki, 2015), and further details can be found in Peyrache et al. (2015).
Dataset Splits	Yes	We split the 15000 samples equally into 5 trials for 5-fold cross-validation. In each fold, we concatenated 4 trials to form a training set.
Hardware Specification	No	It takes about 13 minutes on a regular desktop computer to learn a 10D PGPCA (GeCOV) model with 12000 training samples.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers.
Experiment Setup	Yes	For both cases, we generate 5000 training samples from the true model and learn every PGPCA model (EuCOV/GeCOV) with 500 landmarks z_{1:500} in (9) using 20 EM iterations. The PGPCA model dimension can be m ∈ [0, 2]. For all four cases, we generate 50000 samples from the true model, and every PGPCA model (uniAng/uniTorus, EuCOV/GeCOV) with 1000 landmarks z_{1:1000} is learned using 40 EM iterations. For both mice, we first computed the firing rates by smoothing the spike time-series with a Gaussian kernel with a standard deviation of 100 ms. The firing rates were then down-sampled to 15000 samples with a 100 ms step size (equivalent to a data duration of 25 minutes in total). As preprocessing following prior work, we first applied a square root to the firing rates to stabilize the variance (Chaudhuri et al., 2019), and then projected the data using Isomap (Tenenbaum et al., 2000) from ℝ^50 (Mouse 12) and ℝ^22 (Mouse 28) to ℝ^10.
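
The sample counts quoted in the Dataset Splits, Hardware Specification, and Experiment Setup rows can be cross-checked with a short sketch. This is not the authors' code: the function names are hypothetical, and it assumes that the 5 trials are contiguous, non-shuffled blocks in time and that the trial left out of each training set serves as the test set. The quoted text states only that 4 trials are concatenated for training.

```python
def contiguous_trials(n_samples, n_trials):
    """Split indices 0..n_samples-1 into equal contiguous blocks.

    Contiguity is an assumption; the source only says the samples
    are split "equally into 5 trials".
    """
    if n_samples % n_trials != 0:
        raise ValueError("n_samples must divide evenly into n_trials")
    size = n_samples // n_trials
    return [range(k * size, (k + 1) * size) for k in range(n_trials)]


def cross_validation_splits(n_samples=15000, n_trials=5):
    """Yield (train_indices, test_indices) for each fold.

    Four trials are concatenated into the training set; using the
    remaining trial as the test set is an assumption.
    """
    trials = contiguous_trials(n_samples, n_trials)
    for held_out in range(n_trials):
        train = [i for k, trial in enumerate(trials)
                 if k != held_out for i in trial]
        test = list(trials[held_out])
        yield train, test


splits = list(cross_validation_splits())
train_sizes = [len(train) for train, _ in splits]
test_sizes = [len(test) for _, test in splits]

# 15000 samples at a 100 ms step, converted from milliseconds to minutes.
duration_minutes = 15000 * 100 / 60000

print(train_sizes, test_sizes, duration_minutes)
```

Under these assumptions each fold has 12000 training samples, which matches the training-set size quoted in the Hardware Specification row, and the total duration works out to the 25 minutes stated in the Experiment Setup row.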