Linear Dimensionality Reduction: Survey, Insights, and Generalizations

Authors: John P. Cunningham, Zoubin Ghahramani

JMLR 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Section 4 validates this claim by applying this generic solver without change to different objectives f(X), both classic and novel. We require only the condition that f(X) be differentiable in M to enable simple gradient descent methods. ... Figure 2: Performance comparison between heuristic solvers and direct optimization of linear dimensionality reduction objectives."
Researcher Affiliation | Academia | John P. Cunningham, Department of Statistics, Columbia University, New York City, USA; Zoubin Ghahramani, Department of Engineering, University of Cambridge, Cambridge, UK
Pseudocode | Yes | "Algorithm 1 gives pseudocode for a projected gradient method over the Stiefel manifold."
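The projected gradient method referenced above can be sketched in a few lines. The following is a hypothetical NumPy illustration, not the authors' MATLAB/Manopt implementation: it maximizes the PCA objective f(M) = tr(MᵀΣM) over the Stiefel manifold {M : MᵀM = I} by plain gradient ascent followed by a polar-decomposition projection back onto the manifold (the step size `lr` and iteration count are arbitrary choices for the sketch).

```python
import numpy as np

def stiefel_project(A):
    # Project onto the Stiefel manifold: the closest matrix with
    # orthonormal columns (in Frobenius norm) is the polar factor U @ Vt.
    U, _, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ Vt

def projected_gradient_pca(Sigma, r, steps=500, lr=0.01, seed=0):
    # Hypothetical sketch: maximize f(M) = tr(M^T Sigma M) subject to
    # M^T M = I by gradient ascent plus projection (a simple retraction).
    rng = np.random.default_rng(seed)
    d = Sigma.shape[0]
    M = stiefel_project(rng.standard_normal((d, r)))
    for _ in range(steps):
        grad = 2.0 * Sigma @ M                  # Euclidean gradient of f
        M = stiefel_project(M + lr * grad)      # ascend, then retract
    return M
```

For the PCA objective this converges to the dominant r-dimensional eigenspace of Σ, so tr(MᵀΣM) approaches the sum of the top r eigenvalues; the same loop applies to any objective differentiable in M, which is the point the quoted passage makes.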
Open Source Code | Yes | "We implemented these methods in MATLAB, both natively for first order methods, and using the excellent manopt software library (Boumal et al., 2014) for first and second order methods (all code is available at http://github.com/cunni/ldr)."
Open Datasets | No | "In each panel (A and B), we simulated data of dimensionality d = 3, with n = 3000 points, corresponding to 1000 points in each of 3 clusters (shown in black, blue, and red). Data in each cluster were normally distributed with random means (normal with standard deviation 5/2) and random covariance (uniformly distributed orientation and exponentially distributed eccentricity with mean 5). We ran PCA on 20 random data sets for each dimensionality d ∈ {4, 8, 16, ..., 1024}, each time projecting onto r = 3 dimensions. Data were normally distributed with random covariance (exponentially distributed eccentricity with mean 2)."
Dataset Splits | No | The paper describes generating synthetic data and testing methods on it, but does not mention specific training, validation, or test splits. The focus is on optimizing objectives for dimensionality reduction.
Hardware Specification | No | The paper does not provide any specific hardware details such as CPU or GPU models, or the memory specifications used for running the experiments.
Software Dependencies | No | "We implemented these methods in MATLAB, both natively for first order methods, and using the excellent manopt software library (Boumal et al., 2014) for first and second order methods (all code is available at http://github.com/cunni/ldr)." While MATLAB and Manopt are mentioned, specific version numbers are not provided for either.
Experiment Setup | Yes | "We ran PCA on 20 random data sets for each dimensionality d ∈ {4, 8, 16, ..., 1024}, each time projecting onto r = 3 dimensions. Data were normally distributed with random covariance (exponentially distributed eccentricity with mean 2). ... We generated data with 1000 data points in each of d classes, where within class data was generated according to a normal distribution with random covariance (uniformly distributed orientation and exponentially distributed eccentricity with mean 5), and each class mean vector was randomly chosen (normal with standard deviation 5/d)."
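The synthetic setup quoted above (Gaussian clusters with random means and random covariances) can be reproduced approximately as follows. This is a hedged NumPy sketch under one reading of the description: "uniformly distributed orientation" is drawn via QR of a Gaussian matrix, and "exponentially distributed eccentricity" scales the spread of the covariance eigenvalues; `random_covariance` and `make_clusters` are illustrative names, not the paper's code.

```python
import numpy as np

def random_covariance(d, mean_ecc=5.0, rng=None):
    # One plausible reading of "random covariance (uniformly distributed
    # orientation and exponentially distributed eccentricity with mean 5)":
    # a random orthogonal orientation and eigenvalues spread by an
    # exponentially distributed eccentricity factor.
    rng = rng or np.random.default_rng()
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))   # random orientation
    ecc = 1.0 + rng.exponential(mean_ecc)              # eigenvalue spread
    eigvals = np.linspace(1.0, ecc, d)
    return Q @ np.diag(eigvals) @ Q.T

def make_clusters(d=3, n_per=1000, k=3, mean_sd=2.5, rng=None):
    # Simulate k Gaussian clusters (the paper's panel A/B setup uses d = 3,
    # k = 3, n_per = 1000, means drawn with standard deviation 5/2).
    rng = rng or np.random.default_rng(0)
    X, y = [], []
    for c in range(k):
        mu = rng.normal(0.0, mean_sd, size=d)
        cov = random_covariance(d, rng=rng)
        X.append(rng.multivariate_normal(mu, cov, n_per))
        y.append(np.full(n_per, c))
    return np.vstack(X), np.concatenate(y)
```

With the defaults this yields the n = 3000 points in 3 clusters described in the quote; the PCA scaling experiment then varies d over {4, 8, 16, ..., 1024} while projecting onto r = 3 dimensions.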