reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Gen-Oja: Simple & Efficient Algorithm for Streaming Generalized Eigenvector Computation

Authors: Kush Bhatia, Aldo Pacchiano, Nicolas Flammarion, Peter L. Bartlett, Michael I. Jordan

NeurIPS 2018 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Here we illustrate the practical utility of Gen-Oja on a synthetic, streaming generalized eigenvector problem. We take d = 20 and T = 10^6. The streams (At, Bt) (Rd d)^2 are normally-distributed with covariance matrix A and B with random eigenvectors and eigenvalues decaying as 1/i, for i = 1, . . . , d. Here R^2 denotes the radius of the streams with R^2 = max{Tr A, Tr B}. All results are averaged over ten repetitions. Comparison with two-steps methods. In the left plot of Figure 1 we compare the behavior of Gen-Oja to different two-steps algorithms.
Researcher Affiliation	Academia	Kush Bhatia University of California, Berkeley EMAIL Aldo Pacchiano University of California, Berkeley EMAIL Nicolas Flammarion University of California, Berkeley EMAIL Peter L. Bartlett University of California, Berkeley EMAIL Michael I. Jordan University of California, Berkeley EMAIL
Pseudocode	Yes	Algorithm 1: Gen-Oja for Streaming Av = λBv
Open Source Code	No	The paper does not provide any explicit statements about open-source code availability or links to code repositories.
Open Datasets	No	The paper uses a synthetic dataset generated for the simulations: "The streams (At, Bt) (Rd d)^2 are normally-distributed with covariance matrix A and B with random eigenvectors and eigenvalues decaying as 1/i, for i = 1, . . . , d." It does not provide access information for a publicly available or open dataset as it's generated for the experiment.
Dataset Splits	No	The paper describes a streaming problem with synthetic data generation; it does not mention traditional training, validation, or test splits. The evaluation is conducted on the generated streams.
Hardware Specification	No	The paper does not provide any specific details about the hardware used to run the experiments.
Software Dependencies	No	The paper does not list specific software dependencies with version numbers.
Experiment Setup	Yes	Fix any δ > 0 and ϵ1 > 0. Suppose that the step sizes are set to αt = c log(d2β+t) and βt = γ λ(d2β+t) for γ > 1/2 , c > 1 and ... In the middle plot of Figure 1 we compare the behavior of Gen-Oja for step size α {α , α /8, α /16} where α = 1/R2. We observe that Gen-Oja converges at a rate O(1/t) independently of the choice of α. Robustness to incorrect step-size βt. In the right plot of Figure 1 we compare the behavior of Gen-Oja for step size βt {β /t, β /16t, β /i} where β corresponds to the minimal error after one pass over the data.