Geometry-Aware Visualization of High-Dimensional Symmetric Positive Definite Matrices
Authors: Thibault de Surrel, Sylvain Chevallier, Fabien Lotte, Florian Yger
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We performed experiments on a controlled synthetic dataset to ensure that the low-dimensional representation preserves the geometric properties of both SPD Gaussians and geodesics. We also conduct experiments on various real datasets, such as video, anomaly detection, brain signals and others. In Section 7, we test our algorithms on synthetic datasets and real-life datasets from 4 different applications. |
| Researcher Affiliation | Academia | Thibault de Surrel, LAMSADE, CNRS, PSL Univ. Paris-Dauphine, Paris, France; Sylvain Chevallier, LISN, University Paris-Saclay, Gif-sur-Yvette, France; Fabien Lotte, Inria center at the University of Bordeaux, LaBRI, Talence, France; Florian Yger, LITIS, INSA Rouen-Normandy, Rouen, France |
| Pseudocode | No | The paper describes the mathematical formulations and gradient updates for Riemannian MDS and Riemannian t-SNE in Section 4 and Appendix C, but it does not include any clearly labeled pseudocode blocks or algorithm figures. |
| Open Source Code | Yes | We used pyRiemann (Barachant et al., 2023) to deal with SPD matrices in Python. Our code is available at https://github.com/thibaultdesurrel/riemannien_dimension_reduction |
| Open Datasets | Yes | Setup: After testing our algorithms on synthetic datasets, we now study their behavior on 6 real datasets from different applications that are summarized in Table 1. More details on the datasets as well as on the setup are given in Appendix G. We used the different algorithms presented in Sec. 7.1 to reduce the datasets. Table 1 (Summary of the datasets used for the experiments; name, domain, number of matrices, dimension, reference): BNCI2014001, Brain-Computer Interfaces, 288 × 9 subjects, 22×22 (Tangermann et al., 2012); BNCI2014002, Brain-Computer Interfaces, 160 × 12 subjects, 15×15 (Steyrl et al., 2015); AlexMI, Brain-Computer Interfaces, 60 × 8 subjects, 16×16 (Barachant, 2012); Air Quality, atmospheric pollutants, 102, 6×6 (Hua et al., 2021; Smith et al., 2022a); FPHA, video sequences of hand actions, 108, 63×63 (Garcia-Hernando et al., 2018; Wang et al., 2023); TEP, anomaly detection, 420, 52×52 (Downs & Vogel, 1993; Smith et al., 2022b) |
| Dataset Splits | No | The paper describes the number of subjects/matrices and classes for each dataset but does not provide specific details on how the data was split into training, validation, or test sets. For example, it mentions "The two classes of each dataset are balanced" for BCI datasets, but no train/test/validation splits are specified. |
| Hardware Specification | Yes | The computations were made on a MacBook Pro M3 with 36 GB of memory. |
| Software Dependencies | Yes | We used pyRiemann (Barachant et al., 2023) to deal with SPD matrices in Python (pyriemann/pyriemann: v0.5, 2023); our code is available at https://github.com/thibaultdesurrel/riemannien_dimension_reduction. We have checked numerically this gradient using the tools in PyManopt (Townsend et al., 2016). For the Euclidean t-SNE, we chose a fixed perplexity of 30 as it is the default parameter for the t-SNE of scikit-learn (Pedregosa et al., 2011). |
| Experiment Setup | Yes | For the Riemannian t-SNE (AIRM and Log-Euclidean), we chose a perplexity of (3/4)N, where N is the total number of points. For the Euclidean t-SNE, we chose a fixed perplexity of 30 as it is the default parameter for the t-SNE of scikit-learn (Pedregosa et al., 2011). In practice, we stopped the RGD when the gradient was small enough (< 10^-6). |
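The setup row above names two concrete choices: a dataset-size-dependent perplexity of (3/4)N for the Riemannian t-SNE variants versus scikit-learn's default of 30 for the Euclidean baseline, and a gradient-norm stopping threshold of 10^-6 for the Riemannian gradient descent. The sketch below illustrates both in plain Python on a toy scalar objective; the function names are hypothetical and this is not the authors' implementation, which operates on SPD manifolds via pyRiemann.

```python
def riemannian_tsne_perplexity(n_points):
    # Paper's reported choice for Riemannian t-SNE (AIRM / Log-Euclidean):
    # perplexity = (3/4) * N, where N is the total number of points.
    return 0.75 * n_points

EUCLIDEAN_TSNE_PERPLEXITY = 30  # scikit-learn's default, used for the Euclidean baseline

def gradient_descent(grad, x0, step=0.1, tol=1e-6, max_iter=10_000):
    """Toy (Euclidean) gradient descent that stops when the gradient
    magnitude falls below tol, mirroring the paper's RGD stopping rule
    (< 1e-6). Illustrative only: the paper's RGD works on the SPD manifold."""
    x = x0
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < tol:  # stopping criterion from the setup row
            break
        x -= step * g
    return x

# Toy quadratic f(x) = (x - 2)^2 with gradient 2(x - 2); minimum at x = 2.
x_star = gradient_descent(lambda x: 2.0 * (x - 2.0), x0=10.0)
```

For a dataset of 100 points, `riemannian_tsne_perplexity(100)` gives 75, far above the fixed Euclidean default of 30; tying perplexity to N keeps the effective neighborhood size proportional to the dataset.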