Wasserstein F-tests for Fréchet regression on Bures-Wasserstein manifolds

Authors: Haoshu Xu, Hongzhe Li

JMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Simulations validate the accuracy of the asymptotic theory. Finally, we apply our methods to a single-cell gene expression dataset, revealing age-related changes in gene co-expression networks." ... "In this section, we propose a Riemannian gradient descent algorithm for optimizing (15) in Section 5.1. We then present a series of numerical experiments in Section 5.2 to validate our theoretical results on the central limit theorem (Theorem 11), asymptotic null distribution (Theorem 16) and power (Theorem 18)."
Researcher Affiliation | Academia | Haoshu Xu (EMAIL), Graduate Group in Applied Mathematics and Computational Science, University of Pennsylvania, Philadelphia, PA 19104, USA; Hongzhe Li (EMAIL), Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA.
Pseudocode | Yes | "Algorithm 1 (GD for Fréchet regression). Input: covariates {X_i}_{i=1}^n, responses {Q_i}_{i=1}^n, ρ > 0, n ≥ 1, covariate x, learning rate η, initialization S_0, maximum number of iterations T, threshold eps."
Open Source Code | No | The text lacks a clear, affirmative statement of code release. The paper mentions using the Python dcor package (Ramos-Carreño and Torrecilla, 2023) but does not provide access to the authors' own source code implementing the methodology described in the paper.
Open Datasets | Yes | "We are interested in understanding the co-expression structure of 61 genes in these KEGG nutrient-sensing pathways based on the recently published population-scale single-cell RNA-seq data of human peripheral blood mononuclear cells (PBMCs) from blood samples of over 982 healthy individuals with ages ranging from 20 to 90 (Yazar et al., 2022)."
Dataset Splits | No | The paper refers to a "single-cell gene expression dataset" but does not specify any training, validation, or test splits, nor does it refer to predefined splits from a cited source.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running the experiments.
Software Dependencies | Yes | "We compare the power of our test with that of the distance covariance test (dcov) (Székely et al., 2007) for testing independence, using the Python dcor package (Ramos-Carreño and Torrecilla, 2023)."
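For context on this dependency: the dcov statistic the paper benchmarks against is the sample distance covariance of Székely et al. (2007), which the dcor package computes. A minimal numpy sketch of that statistic (the function names here are illustrative, not the dcor API; dcor additionally provides permutation-based p-values for the independence test) might look like:

```python
import numpy as np

def distance_covariance_sq(x, y):
    """Squared sample distance covariance (V-statistic) of Szekely et al. (2007)
    for 1-D samples x, y of equal length.  Illustrative sketch only."""
    x = np.asarray(x, dtype=float)[:, None]
    y = np.asarray(y, dtype=float)[:, None]
    a = np.abs(x - x.T)                # pairwise distance matrices
    b = np.abs(y - y.T)
    # double-center each distance matrix: subtract row/column means, add grand mean
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    return (A * B).mean()

def distance_correlation(x, y):
    """Sample distance correlation; equals 1 when y is an affine function of x."""
    v = distance_covariance_sq(x, y)
    vx = distance_covariance_sq(x, x)
    vy = distance_covariance_sq(y, y)
    return 0.0 if vx * vy == 0 else np.sqrt(v / np.sqrt(vx * vy))
```

In practice the test rejects independence for large values of the statistic, with the null distribution approximated by permutation, which is what the dcor package automates.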
Experiment Setup | Yes | "Algorithm 1 (GD for Fréchet regression). Input: covariates {X_i}_{i=1}^n, responses {Q_i}_{i=1}^n, ρ > 0, n ≥ 1, covariate x, learning rate η, initialization S_0, maximum number of iterations T, threshold eps." ... "For the initialization S_0, optimization over the Euclidean space typically starts near the origin (Chen et al., 2019; Ye and Du, 2021). However, since the space of symmetric positive definite (SPD) matrices, S_d^{++}, is nonlinear, the natural counterpart of the origin in this space is the identity matrix I_d. Therefore, we initialize at S_0 = I_d." ... "For the step size η, Altschuler et al. (2021) observed through numerical simulations that while the convergence rate of Euclidean gradient descent is highly sensitive to its step size, Riemannian gradient descent requires no tuning and works effectively with η = 1 when computing the Bures-Wasserstein barycenter. In our simulations, we also find that η = 1 performs at least as well as (and often better than) smaller step sizes." ... "setting η = 1, T = 30 and eps = 10^{-6} in Algorithm 1."
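The setup above (identity initialization S_0 = I_d, step size η = 1, at most T = 30 iterations, stopping threshold eps = 10^{-6}) can be sketched for the Bures-Wasserstein barycenter special case that Altschuler et al. (2021) study. This is a sketch under stated assumptions, not the paper's Algorithm 1: the function names are hypothetical, and the uniform weights below would be replaced by the covariate-dependent Fréchet-regression weights in the actual algorithm.

```python
import numpy as np

def sqrtm_spd(A):
    """Matrix square root of an SPD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def bw_barycenter_gd(Qs, weights=None, eta=1.0, T=30, eps=1e-6):
    """Riemannian gradient descent for a weighted Bures-Wasserstein
    barycenter of SPD matrices, initialized at the identity.
    Illustrative sketch of the barycenter special case only."""
    n, d = len(Qs), Qs[0].shape[0]
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights, dtype=float)
    S = np.eye(d)                              # S_0 = I_d
    for _ in range(T):
        S_half = sqrtm_spd(S)
        S_half_inv = np.linalg.inv(S_half)
        # Riemannian gradient via the optimal transport maps T_i
        # from N(0, S) to N(0, Q_i)
        G = np.zeros((d, d))
        for wi, Q in zip(w, Qs):
            Ti = S_half_inv @ sqrtm_spd(S_half @ Q @ S_half) @ S_half_inv
            G += wi * (Ti - np.eye(d))
        if np.linalg.norm(G) < eps:            # stopping threshold `eps`
            break
        M = np.eye(d) + eta * G                # step with eta = 1
        S = M @ S @ M                          # retraction back onto SPD matrices
    return S
```

With η = 1 this converges in a single step when all responses coincide (the transport maps then agree and the gradient vanishes at the next iterate), which is consistent with the reported insensitivity of the Riemannian scheme to step-size tuning.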