Wrapped Gaussian on the manifold of Symmetric Positive Definite Matrices

Authors: Thibault De Surrel, Fabien Lotte, Sylvain Chevallier, Florian Yger

ICML 2025

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | Experiments on synthetic and real-world datasets demonstrate the robustness and flexibility of this geometry-aware distribution, underscoring its potential to advance manifold-based data analysis. This work lays the groundwork for extending classical machine learning and statistical methods to more complex and structured data.
Researcher Affiliation | Academia | 1 LAMSADE, CNRS, PSL Univ. Paris-Dauphine, France; 2 Inria center at the University of Bordeaux / LaBRI, France; 3 TAU, LISN, University Paris-Saclay, France; 4 LITIS, INSA Rouen-Normandy, France.
Pseudocode | Yes |
Algorithm 1: Sampling from a Wrapped Gaussian WG(p; µ, Σ)
  Require: p ∈ P_d, µ ∈ R^{d(d+1)/2}, Σ ∈ P_{d(d+1)/2}
  1: Sample t ~ N(µ, Σ)
  2: Compute X = Exp_p(Vect_p^{-1}(t))
  3: Return X ~ WG(p; µ, Σ)
Open Source Code | Yes | The code for the different experiments is available at https://github.com/thibaultdesurrel/wrapped_gaussians_SPD.
Open Datasets | Yes |
Dataset | Domain | Matrix size | Samples | Classes | Reference
BNCI2014004 | BCI | 3×3 | 720 × 9 subjects | 2 | (Leeb et al., 2007)
Zhou2016 | BCI | 5×5 | 320 × 4 subjects | 2 | (Zhou et al., 2016)
Air Quality | Atmospheric data | 6×6 | 102 | 3 | (Smith et al., 2022)
Indiana Pines | Hyperspectral imaging | 5×5 | 14,641 | 12 | (Baumgardner et al., 2015)
Pavia Univ. | Hyperspectral imaging | 5×5 | 185,176 | 6 |
Salinas | Hyperspectral imaging | 5×5 | 94,184 | 17 |
Textile | Image analysis | 10×10 | 16,000 | 2 | (Bergmann et al., 2021)
Breizh Crops | Multispectral imaging | 13×13 | 177,658 | 6 | (Rußwurm et al., 2020)
Dataset Splits | Yes | The experiment we led was cross-subject: each classifier was trained on all subjects except one and tested on the held-out subject. For the non-BCI datasets (Air Quality, Indiana Pines, Pavia Univ., Salinas, Textile and Breizh Crops), we used a 5-fold cross-validation to evaluate the performance of the classifiers.
Hardware Specification | No | The paper does not explicitly mention any specific hardware used for running its experiments, such as GPU models, CPU models, or cloud resources with specifications.
Software Dependencies | No | We implemented this MLE in Python using the toolbox Pymanopt (Townsend et al., 2016). We used the library MOABB (Aristimunha et al., 2023) to load and preprocess the data. The TS-LDA uses the TangentSpace class from pyRiemann (Barachant et al., 2024) and the LDA from Scikit-learn (Pedregosa et al., 2011).
Experiment Setup | Yes | We chose relatively small values for X and s because otherwise, when the dimension d is large, the generated parameters are very far from the identity, leading to numerical instability. More details on the experimental setup are given in Appendix I. ... For p, we use the function generate_random_spd_matrix ... We set X = 0.1 I_d and s = 1. For µ, we generate a random vector of size d(d+1)/2 with values in [0, 0.1]. For Σ, we generate a random SPD matrix using the same function as for p, with X = 0.01 I_{d(d+1)/2} and s = 0.02. ... We start by applying a standard band-pass filter with range [7, 35] Hz. Then, we used the Ledoit-Wolf shrunk covariance matrix ... First, we normalize the data by subtracting the image's global mean. Then, we apply a PCA to reduce the dimension of the data to 5. A sliding window with no overlap is then used around each pixel for data sampling, and the result is then vectorized. In our experiments, we used a window of size 25 × 25. ... To optimize the MLE, we used in practice the Riemannian conjugate gradient method (Boumal, 2023) with a maximum of 1,000,000 iterations and a max time set to 2 hours.
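The sampling procedure of Algorithm 1 above can be sketched in Python. This is a minimal illustration only: it assumes the affine-invariant metric on P_d, so that Exp_p(V) = p^{1/2} expm(p^{-1/2} V p^{-1/2}) p^{1/2}, and a standard isometric vectorization of symmetric matrices; the paper's exact Vect_p convention may differ.

```python
import numpy as np
from scipy.linalg import expm, sqrtm

def unvect(t, d):
    """Map a vector of size d(d+1)/2 back to a symmetric d x d matrix,
    assuming the isometric convention where off-diagonal entries were
    scaled by sqrt(2) during vectorization."""
    S = np.zeros((d, d))
    S[np.triu_indices(d)] = t
    S = S + S.T - np.diag(np.diag(S))        # symmetrize
    S[~np.eye(d, dtype=bool)] /= np.sqrt(2)  # undo off-diagonal scaling
    return S

def exp_p(p, V):
    """Riemannian exponential at p for the affine-invariant metric."""
    p_half = np.real(sqrtm(p))
    p_inv_half = np.linalg.inv(p_half)
    return p_half @ expm(p_inv_half @ V @ p_inv_half) @ p_half

def sample_wrapped_gaussian(p, mu, Sigma, rng):
    """Algorithm 1: sample t ~ N(mu, Sigma), lift it to a tangent
    vector at p, and push it to the manifold with Exp_p."""
    t = rng.multivariate_normal(mu, Sigma)
    return exp_p(p, unvect(t, p.shape[0]))

# Example: one sample at the identity base point (d = 3)
d = 3
rng = np.random.default_rng(0)
k = d * (d + 1) // 2
X = sample_wrapped_gaussian(np.eye(d), np.zeros(k), 0.01 * np.eye(k), rng)
```

By construction the sample X is symmetric positive definite, since Exp_p maps tangent vectors to points of P_d.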
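The evaluation protocol described under Dataset Splits (leave-one-subject-out for the BCI datasets, 5-fold cross-validation for the rest) can be sketched with scikit-learn. Toy random data and LogisticRegression are stand-ins for the paper's actual features and classifiers.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, LeaveOneGroupOut

rng = np.random.default_rng(0)
X = rng.normal(size=(90, 6))             # toy features
y = rng.integers(0, 2, size=90)          # toy binary labels
subjects = np.repeat(np.arange(9), 10)   # 9 subjects, 10 trials each

# Cross-subject protocol (BCI datasets): train on all subjects but one,
# test on the held-out subject.
logo = LeaveOneGroupOut()
cross_subject_scores = [
    LogisticRegression().fit(X[tr], y[tr]).score(X[te], y[te])
    for tr, te in logo.split(X, y, groups=subjects)
]

# 5-fold cross-validation (non-BCI datasets).
kf = KFold(n_splits=5, shuffle=True, random_state=0)
cv_scores = [
    LogisticRegression().fit(X[tr], y[tr]).score(X[te], y[te])
    for tr, te in kf.split(X, y)
]
```

With 9 subjects, the cross-subject loop yields one score per held-out subject, while the non-BCI protocol yields five fold scores.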
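The synthetic parameter generation described in the setup (base point p, tangent mean µ, tangent covariance Σ) can be sketched as follows. The generator here is a hypothetical stand-in for the paper's generate_random_spd_matrix: it treats X as an SPD centre and s as the scale of a random PSD perturbation, which guarantees an SPD result; the paper's actual generator may differ.

```python
import numpy as np

def generate_random_spd_matrix(X, s, rng):
    """Hypothetical stand-in for the paper's generator: an SPD centre X
    plus an s-scaled random PSD perturbation, which stays SPD."""
    d = X.shape[0]
    B = rng.normal(size=(d, d)) / np.sqrt(d)
    return X + s * (B @ B.T)

d = 3
k = d * (d + 1) // 2        # tangent-space dimension, d(d+1)/2
rng = np.random.default_rng(0)

# Parameters as in the setup: small X and s keep them near the identity.
p = generate_random_spd_matrix(0.1 * np.eye(d), 1.0, rng)        # base point
mu = rng.uniform(0.0, 0.1, size=k)                               # tangent mean
Sigma = generate_random_spd_matrix(0.01 * np.eye(k), 0.02, rng)  # tangent covariance
```

Keeping X and s small, as the authors note, avoids parameters far from the identity and the resulting numerical instability in high dimension.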