reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Fast Regression for Structured Inputs

Authors: Raphael A Meyer, Cameron N Musco, Christopher P Musco, David Woodruff, Samson Zhou

ICLR 2022 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We provide empirical evidence to validate our core statistical claim about Vandermonde regression: that the relative error ε achieved by ℓp/2r Lewis Weight subsampling is polynomial in p, and not exponential in p.
Researcher Affiliation	Academia	Raphael A. Meyer New York University EMAIL Cameron Musco University of Massachusetts Amherst EMAIL Christopher Musco New York University EMAIL David P. Woodruff Carnegie Mellon University EMAIL Samson Zhou Carnegie Mellon University EMAIL
Pseudocode	Yes	Fig. 1: Iterative algorithm for approximate ℓp Lewis weights with p < 4. Algorithm 1 Round Trunc: Round and truncate coordinates of input vector b Algorithm 2 Faster regression for Vandermonde matrices Algorithm 3 Faster ℓp regression for general matrices
Open Source Code	No	The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available.
Open Datasets	No	To validate our theory, we plot log(εempirical) versus p and m on synthetic data. Speciﬁcally, we generate n = 25, 000 i.i.d. N(0, 1) times samples to form a Vandermonde matrix V with d = 20 columns, then compute the polynomial q(t) = t10 at each time sample, add N(0, 1010) additive noise, and save the corresponding values in b.
Dataset Splits	No	The paper describes generating synthetic data for experiments but does not mention specific train/validation/test dataset splits.
Hardware Specification	Yes	Figure 2 shows the result of these experiments, which were run in Julia 1.6.1, on Windows 10 with an Intel i7-7700K CPU and 16Gb RAM.
Software Dependencies	Yes	Figure 2 shows the result of these experiments, which were run in Julia 1.6.1, on Windows 10 with an Intel i7-7700K CPU and 16Gb RAM.
Experiment Setup	Yes	To validate our theory, we plot log(εempirical) versus p and m on synthetic data. Speciﬁcally, we generate n = 25, 000 i.i.d. N(0, 1) times samples to form a Vandermonde matrix V with d = 20 columns, then compute the polynomial q(t) = t10 at each time sample, add N(0, 1010) additive noise, and save the corresponding values in b. ... In Figure 2a, we ﬁx m = 1000 and vary p [2, 25]. ... in Figure 2b, we ﬁx p = 6 and vary m [100, 3000]. ... For this test, we ﬁx n = 25, 000, d = 10, and p = 6, while varying m [1, 1000].