Dynamics of the accelerated t-SNE

Authors: Kyoichi Iwasaki, Hideitsu Hino

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through rigorous theoretical analysis and empirical validation, we show that our approach offers further insights into the dynamical properties of t-SNE. Our main contributions are summarized as follows: we explore a dynamical-system analysis of t-SNE including a momentum term, further extended to NAG. ... We numerically evaluate our results by comparing the t-SNE algorithms with and without acceleration and their ODE counterparts on real-world datasets: KDDCup1999 and MNIST.
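The momentum (heavy-ball) and NAG updates named above can be sketched generically; this is a minimal illustration of the two discrete update rules, not the paper's continuous-time ODE analysis, and the quadratic gradient standing in for the t-SNE loss is a hypothetical stand-in:

```python
import numpy as np

def momentum_step(y, v, grad, h=5.0, m=0.5):
    """One heavy-ball update: v <- m*v - h*grad(y); y <- y + v."""
    v = m * v - h * grad(y)
    return y + v, v

def nag_step(y, v, grad, h=5.0, m=0.5):
    """One Nesterov (NAG) update: the gradient is evaluated
    at the look-ahead point y + m*v instead of y."""
    v = m * v - h * grad(y + m * v)
    return y + v, v

# Toy quadratic objective standing in for the t-SNE loss gradient.
grad = lambda y: 2.0 * y
y, v = np.array([1.0, -1.0]), np.zeros(2)
for _ in range(50):
    y, v = nag_step(y, v, grad, h=0.1, m=0.5)
```

The only structural difference between the two schemes is the point at which the gradient is evaluated, which is what the paper's ODE comparison isolates.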
Researcher Affiliation | Academia | Kyoichi Iwasaki (EMAIL), The Graduate University for Advanced Studies, SOKENDAI; Hideitsu Hino (EMAIL), The Institute of Statistical Mathematics
Pseudocode | No | The paper presents mathematical equations and theoretical derivations for the dynamics of t-SNE, along with descriptions of the algorithms (GD, MM, NAG), but it does not present any structured pseudocode or algorithm blocks.
Open Source Code | No | Although the term "acceleration" often suggests computational speed-up, we clarify that our current framework focuses on theoretical acceleration in the continuous-time dynamics, rather than practical runtime efficiency. In fact, due to the reliance on full eigendecomposition of the graph Laplacian, the overall computational complexity remains O(n^3) in our current implementation. There is no explicit statement or link indicating that the code for the described methodology is publicly available.
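The O(n^3) bottleneck cited above comes from a dense eigendecomposition of a graph Laplacian. A minimal sketch of that step, on a small synthetic affinity matrix (not the paper's actual similarity construction), looks like this:

```python
import numpy as np

# Synthetic symmetric affinity matrix standing in for t-SNE similarities.
rng = np.random.default_rng(0)
n = 50
W = rng.random((n, n))
W = (W + W.T) / 2              # symmetrize
np.fill_diagonal(W, 0.0)       # no self-affinities

# Unnormalized graph Laplacian L = D - W.
L = np.diag(W.sum(axis=1)) - W

# Full dense eigendecomposition: cubic in n, which is the O(n^3)
# cost the assessment refers to.
eigvals, eigvecs = np.linalg.eigh(L)
```

For a connected graph with nonnegative weights, the smallest Laplacian eigenvalue is 0 (with the constant eigenvector), which is a quick sanity check on the construction.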
Open Datasets | Yes | We numerically evaluate our results by comparing the t-SNE algorithms with and without acceleration and their ODE counterparts on real-world datasets: KDDCup1999 and MNIST. ... KDDCup1999: https://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html, CC BY 4.0 license referenced by https://archive.ics.uci.edu/dataset/130/kdd+cup+1999+data; MNIST: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_openml.html, BSD license provided by sklearn; Olivetti Faces: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_olivetti_faces.html, BSD license provided by sklearn
Dataset Splits | No | For the KDDCup1999 dataset... We extracted 100 samples for each of the 5 labels (smurf, neptune, normal, back, satan), totaling 500 samples. ... For the MNIST dataset... we focus on 400 images for each of 4 labels (2, 4, 6, 8), totaling 1600 samples... The paper describes data selection for visualization but does not specify train/test/validation dataset splits or cross-validation methods.
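The per-label subsampling described above (100 per label from KDDCup1999, 400 per label from MNIST) can be sketched as follows; the helper name and the synthetic data are hypothetical, only the selection scheme follows the paper's description:

```python
import numpy as np

def sample_per_label(X, y, labels, n_per_label, seed=0):
    """Draw n_per_label examples for each requested label,
    as in the paper's data selection (e.g. 100 per label)."""
    rng = np.random.default_rng(seed)
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y == lab), size=n_per_label, replace=False)
        for lab in labels
    ])
    return X[idx], y[idx]

# Synthetic stand-in: 5 labels with 400 samples each.
X = np.random.default_rng(1).random((2000, 10))
y = np.repeat(np.arange(5), 400)

# 100 samples per label -> 500 total, mirroring the KDDCup1999 selection.
Xs, ys = sample_per_label(X, y, labels=[0, 1, 2, 3, 4], n_per_label=100)
```

Since no train/test split is reported, the sketch deliberately returns a single subsample rather than partitions.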
Hardware Specification | Yes | The experiments were conducted on a laptop PC with a 12th Gen Intel(R) Core(TM) i7-1255U processor, 500 GB storage, and 16 GB memory, using Python.
Software Dependencies | No | The experiments were conducted on a laptop PC with a 12th Gen Intel(R) Core(TM) i7-1255U processor, 500 GB storage, and 16 GB memory, using Python. The paper names Python as the programming language but does not specify its version or the versions of any libraries used.
Experiment Setup | Yes | In this paper, we set perplexity = 30, as in Cai & Ma (2022). In all experiments in this section, all initial embedding vectors were generated randomly. ... The optimization uses step size h = 5, momentum parameter m = 0.5, perplexity 30, and exaggeration parameter α = 10. ... We propose a stopping criterion based on a threshold (e.g., 0.01) and demonstrate its effectiveness in subsection 8.2.
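As a rough sketch of this setup using scikit-learn's t-SNE (which the paper does not state it used): sklearn hard-codes its own momentum schedule, so the reported step size h = 5 and momentum m = 0.5 have no direct knobs; only the perplexity and the exaggeration factor map onto sklearn parameters, and the input data here is synthetic:

```python
import numpy as np
from sklearn.manifold import TSNE

# Synthetic stand-in for the paper's MNIST/KDDCup1999 feature matrices.
X = np.random.default_rng(0).random((200, 20))

emb = TSNE(
    n_components=2,
    perplexity=30,          # perplexity = 30, as in Cai & Ma (2022)
    early_exaggeration=10,  # exaggeration parameter alpha = 10
    init="random",          # random initial embedding, as reported
    random_state=0,
).fit_transform(X)
```

The paper's threshold-based stopping criterion (e.g. 0.01) has no sklearn counterpart; sklearn instead stops on `n_iter_without_progress` or the iteration cap.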