reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Unbiased estimators for random design regression

Authors: Michał Dereziński, Manfred K. Warmuth, Daniel Hsu

JMLR 2022 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	For each estimator we plotted the loss L_D(ŵ) for a range of sample sizes k, contrasted with the loss of the best least-squares estimator w computed from all data. Plots shown in Figure 6.2 were averaged over 100 runs, with shaded area representing standard error of the mean. We used six benchmark datasets from the libsvm repository (Chang and Lin, 2011), whose dimensions are given in Table 6.1.
Researcher Affiliation	Collaboration	Michał Dereziński, EMAIL, Department of Electrical Engineering & Computer Science, University of Michigan; Manfred K. Warmuth, EMAIL, UC Santa Cruz and Google Inc.; Daniel Hsu, EMAIL, Department of Computer Science, Columbia University
Pseudocode	Yes	Algorithm 1 Distortion-free intermediate sampling; Algorithm 2 Reverse iterative sampling (Dereziński and Warmuth, 2018)
Open Source Code	No	The paper does not provide concrete access to source code or explicitly state that the code is open-source or provided in supplementary materials.
Open Datasets	Yes	We used six benchmark datasets from the libsvm repository (Chang and Lin, 2011), whose dimensions are given in Table 6.1.
Dataset Splits	No	The paper mentions evaluating estimators for a range of sample sizes and averaging results over runs, but does not provide specific train/test/validation splits for the datasets used in experiments.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments.
Software Dependencies	No	The paper does not provide specific software dependencies, including library or solver names with version numbers, used to replicate the experiments.
Experiment Setup	Yes	For each estimator we plotted the loss L_D(ŵ) for a range of sample sizes k, contrasted with the loss of the best least-squares estimator w computed from all data. Plots shown in Figure 6.2 were averaged over 100 runs, with shaded area representing standard error of the mean.
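
The evaluation protocol quoted under Experiment Setup (loss over a range of sample sizes k, averaged over 100 runs, with the standard error of the mean as the shaded band) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the synthetic one-dimensional data, the uniform subsampling, and all function names (`mean_and_sem`, `fit_1d`, `square_loss`, `loss_curve`) are assumptions made for the example. The paper's own sampling procedures (Algorithms 1 and 2) and the libsvm datasets are not reproduced here.

```python
import random
import statistics


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for a list of per-run values."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / len(values) ** 0.5
    return mean, sem


def fit_1d(xs, ys):
    """Least-squares slope for the model y = w * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def square_loss(w, xs, ys):
    """Mean squared error of slope w, measured on the full dataset."""
    return statistics.fmean((w * x - y) ** 2 for x, y in zip(xs, ys))


def loss_curve(xs, ys, sample_sizes, n_runs=100, seed=0):
    """For each sample size k, average the full-data loss over n_runs subsamples."""
    rng = random.Random(seed)
    indices = range(len(xs))
    curve = {}
    for k in sample_sizes:
        losses = []
        for _ in range(n_runs):
            # ASSUMPTION: uniform sampling without replacement is a stand-in;
            # the paper evaluates other sampling schemes not implemented here.
            chosen = rng.sample(indices, k)
            w_hat = fit_1d([xs[i] for i in chosen], [ys[i] for i in chosen])
            losses.append(square_loss(w_hat, xs, ys))
        curve[k] = mean_and_sem(losses)
    return curve


# Hypothetical synthetic data (not one of the six benchmark datasets).
data_rng = random.Random(1)
xs = [data_rng.gauss(0, 1) for _ in range(500)]
ys = [2.0 * x + data_rng.gauss(0, 0.5) for x in xs]

# Baseline: loss of the least-squares fit computed from all data.
baseline = square_loss(fit_1d(xs, ys), xs, ys)
curve = loss_curve(xs, ys, sample_sizes=[5, 20, 80])

for k, (mean, sem) in curve.items():
    print(f"k={k:3d}  mean loss={mean:.4f}  SEM={sem:.4f}  baseline={baseline:.4f}")
```

Each entry of `curve` holds the mean and standard error that would be drawn as one point and its shaded band on a loss-versus-k plot, with `baseline` as the horizontal reference line.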