Fast Regression for Structured Inputs

Authors: Raphael A Meyer, Cameron N Musco, Christopher P Musco, David Woodruff, Samson Zhou

ICLR 2022 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We provide empirical evidence to validate our core statistical claim about Vandermonde regression: that the relative error ε achieved by ℓp/2r Lewis Weight subsampling is polynomial in p, and not exponential in p.
Researcher Affiliation Academia Raphael A. Meyer New York University EMAIL Cameron Musco University of Massachusetts Amherst EMAIL Christopher Musco New York University EMAIL David P. Woodruff Carnegie Mellon University EMAIL Samson Zhou Carnegie Mellon University EMAIL
Pseudocode Yes Fig. 1: Iterative algorithm for approximate ℓp Lewis weights with p < 4. Algorithm 1 Round Trunc: Round and truncate coordinates of input vector b Algorithm 2 Faster regression for Vandermonde matrices Algorithm 3 Faster ℓp regression for general matrices
Open Source Code No The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available.
Open Datasets No To validate our theory, we plot log(εempirical) versus p and m on synthetic data. Specifically, we generate n = 25, 000 i.i.d. N(0, 1) times samples to form a Vandermonde matrix V with d = 20 columns, then compute the polynomial q(t) = t10 at each time sample, add N(0, 1010) additive noise, and save the corresponding values in b.
Dataset Splits No The paper describes generating synthetic data for experiments but does not mention specific train/validation/test dataset splits.
Hardware Specification Yes Figure 2 shows the result of these experiments, which were run in Julia 1.6.1, on Windows 10 with an Intel i7-7700K CPU and 16Gb RAM.
Software Dependencies Yes Figure 2 shows the result of these experiments, which were run in Julia 1.6.1, on Windows 10 with an Intel i7-7700K CPU and 16Gb RAM.
Experiment Setup Yes To validate our theory, we plot log(εempirical) versus p and m on synthetic data. Specifically, we generate n = 25, 000 i.i.d. N(0, 1) times samples to form a Vandermonde matrix V with d = 20 columns, then compute the polynomial q(t) = t10 at each time sample, add N(0, 1010) additive noise, and save the corresponding values in b. ... In Figure 2a, we fix m = 1000 and vary p [2, 25]. ... in Figure 2b, we fix p = 6 and vary m [100, 3000]. ... For this test, we fix n = 25, 000, d = 10, and p = 6, while varying m [1, 1000].