Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging

Authors: Shusen Wang, Alex Gittens, Michael W. Mahoney

JMLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical evaluations bear out these theoretical results. In particular, in Section 4, we show in Figure 3 that even when the regularization parameter γ is fine-tuned, the risks of classical and Hessian sketch are worse than that of the optimal solution by an order of magnitude. We conduct experiments on synthetic data to verify our theory. Sections 4 and 5 conduct experiments to verify our theories and demonstrate the efficacy of model averaging. We tested the prediction performance of sketched ridge regression by implementing classical sketch with model averaging in PySpark (Zaharia et al., 2010).
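The model-averaging scheme quoted above can be illustrated with a minimal NumPy sketch (a hypothetical illustration, not the authors' PySpark implementation; the function name and seeding are made up here): partition the rows uniformly at random into g groups, which amounts to uniform row selection with sketch size s = n/g, solve the ridge problem on each group, and average the g solutions.

```python
import numpy as np

def model_averaged_classical_sketch(X, y, gamma, g, seed=None):
    """Classical sketch with model averaging (illustrative only).

    Partition the n rows into g random groups (uniform row selection,
    sketch size s = n/g), solve ridge regression on each sketch, and
    return the average of the g solutions.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    perm = rng.permutation(n)
    solutions = []
    for rows in np.array_split(perm, g):
        Xs, ys = X[rows], y[rows]
        s = len(rows)
        # Solve min_w (1/s)||Xs w - ys||^2 + gamma ||w||^2 on this sketch.
        w = np.linalg.solve(Xs.T @ Xs / s + gamma * np.eye(d),
                            Xs.T @ ys / s)
        solutions.append(w)
    return np.mean(solutions, axis=0)
```

In a distributed setting the g per-sketch solves are independent, which is why the paper's PySpark implementation parallelizes naturally over the partitions.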
Researcher Affiliation | Academia | Shusen Wang (EMAIL), International Computer Science Institute and Department of Statistics, University of California at Berkeley, Berkeley, CA 94720, USA; Alex Gittens (EMAIL), Computer Science Department, Rensselaer Polytechnic Institute, Troy, NY 12180, USA; Michael W. Mahoney (EMAIL), International Computer Science Institute and Department of Statistics, University of California at Berkeley, Berkeley, CA 94720, USA
Pseudocode | No | The paper describes methods and theoretical analyses, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured, code-like steps for any procedure.
Open Source Code | Yes | The code is available at https://github.com/wangshusen/SketchedRidgeRegression.git
Open Datasets | Yes | We use the Million Song Year Prediction data set, which has 463,715 training samples and 51,630 test samples with 90 features and one response.
Dataset Splits | Yes | We use the Million Song Year Prediction data set, which has 463,715 training samples and 51,630 test samples with 90 features and one response. We randomly partition the training data into g parts, which amounts to uniform row selection with sketch size s = n/g.
Hardware Specification | No | The paper states, 'We ran our experiments using PySpark in local mode,' but it does not specify any hardware details such as CPU or GPU models, memory, or other machine specifications used for these experiments.
Software Dependencies | No | The paper mentions implementing code 'in PySpark (Zaharia et al., 2010)' and 'in Python', but it does not provide version numbers for these or any other software libraries or dependencies, which are necessary for full reproducibility.
Experiment Setup | Yes | In Figure 2, we plot the objective function value f(w) = (1/n)||Xw − y||_2^2 + γ||w||_2^2 against γ, under different settings of ξ (the standard deviation of the Gaussian noise added to the response). We calculate the bias and variance, bias(w*) and var(w*), of the optimal MRR solution according to Theorem 4. We consider different noise levels by setting ξ = 10^-2 or 10^-1. We use five-fold cross-validation to determine the regularization parameter γ.
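The objective function and the five-fold cross-validation step described above can be sketched in Python (a hedged illustration under assumed conventions: the function names and the grid of candidate γ values are illustrative, not taken from the paper):

```python
import numpy as np

def ridge_objective(X, y, w, gamma):
    # f(w) = (1/n)||Xw - y||_2^2 + gamma * ||w||_2^2
    n = X.shape[0]
    return np.sum((X @ w - y) ** 2) / n + gamma * np.sum(w ** 2)

def cv_select_gamma(X, y, gammas, k=5, seed=0):
    """Pick gamma by k-fold cross-validation on held-out squared error."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    folds = np.array_split(rng.permutation(n), k)
    best_gamma, best_err = None, np.inf
    for gamma in gammas:
        err = 0.0
        for i in range(k):
            test = folds[i]
            train = np.concatenate([folds[j] for j in range(k) if j != i])
            Xt, yt = X[train], y[train]
            # Closed-form ridge solution on the training folds.
            w = np.linalg.solve(Xt.T @ Xt / len(train) + gamma * np.eye(d),
                                Xt.T @ yt / len(train))
            err += np.sum((X[test] @ w - y[test]) ** 2)
        if err < best_err:
            best_gamma, best_err = gamma, err
    return best_gamma
```

The same routine would be applied once per noise level ξ, since the best-performing γ generally grows with the noise added to the response.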