Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty, so scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Tight Dimensionality Reduction for Sketching Low Degree Polynomial Kernels
Authors: Michela Meister, Tamas Sarlos, David Woodruff
NeurIPS 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate Tensorized Random Projections in three different applications. In Section 4.1 we show that Tensorized Random Projections always succeed with high probability while Tensor Sketch always fails on extremely sparse inputs. Then in Section 4.2 we observe that Tensor Sketch and Tensorized Random Projections approximate non-linear SVMs with polynomial kernels equally well. Finally in Section 4.3 we demonstrate that Random Projections and Tensorized Random Projections are equally effective in reducing the number of parameters in a neural network while Tensorized Random Projections are faster to compute. |
| Researcher Affiliation | Collaboration | Michela Meister, Cornell University, Ithaca, NY 14850, EMAIL; Tamas Sarlos, Google Research, Mountain View, CA 94043, EMAIL; David P. Woodruff, Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, EMAIL |
| Pseudocode | No | The paper describes methods and analyses in prose but does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code for the experiments is available at https://github.com/google-research/google-research/tree/master/poly_kernel_sketch. |
| Open Datasets | Yes | We approximate the polynomial kernel ⟨x, y⟩² for the Adult [19] and MNIST [32] datasets, by applying one of the above three sketches to the dataset. |
| Dataset Splits | No | The paper mentions training models and datasets (e.g., 'train a linear SVM', 'Adult and MNIST datasets') but does not provide specific train/validation/test split percentages, sample counts, or a detailed splitting methodology. |
| Hardware Specification | No | The paper does not provide specific hardware details (such as GPU/CPU models, memory, or processor types) used for running the experiments. |
| Software Dependencies | No | The paper mentions software like LIBLINEAR, LIBSVM, and TensorFlow but does not provide specific version numbers for these or other ancillary software components. |
| Experiment Setup | No | The paper indicates that model specifics can be found in external TensorFlow tutorials and does not provide concrete hyperparameter values or detailed training configurations within the main text. |
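The table above reports that the paper's experiments approximate the degree-2 polynomial kernel ⟨x, y⟩² with tensorized random projections. A minimal sketch of that idea, under the standard construction (not taken from the paper's released code): each output coordinate is (gᵢ·x)(hᵢ·x)/√m with independent Gaussian vectors gᵢ, hᵢ, so the dot product of two feature vectors is an unbiased estimate of ⟨x, y⟩². The function name `trp_features` and all constants here are illustrative.

```python
import numpy as np


def trp_features(X, m, rng):
    """Degree-2 tensorized random projection features (illustrative sketch).

    For each of the m output coordinates, draw independent Gaussian
    vectors g_i and h_i; the feature is (g_i . x)(h_i . x) / sqrt(m).
    Since g_i and h_i are independent, the inner product of two feature
    vectors is an unbiased estimator of <x, y>^2.
    """
    n, d = X.shape
    G = rng.standard_normal((m, d))
    H = rng.standard_normal((m, d))
    # (X @ G.T) has shape (n, m); elementwise product tensorizes the two maps.
    return (X @ G.T) * (X @ H.T) / np.sqrt(m)


rng = np.random.default_rng(0)
x = rng.standard_normal(50)
x /= np.linalg.norm(x)
y = rng.standard_normal(50)
y /= np.linalg.norm(y)

Z = trp_features(np.stack([x, y]), m=20000, rng=rng)
estimate = Z[0] @ Z[1]          # sketched kernel value
exact = (x @ y) ** 2            # true degree-2 polynomial kernel
```

With m = 20000 random features the estimate concentrates around the exact kernel value; larger m trades computation for lower variance, which is the dimension/accuracy trade-off the paper studies.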