Unbiased estimators for random design regression
Authors: Michał Dereziński, Manfred K. Warmuth, Daniel Hsu
JMLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | For each estimator we plotted the loss L_D(ŵ) for a range of sample sizes k, contrasted with the loss of the best least-squares estimator w computed from all data. Plots shown in Figure 6.2 were averaged over 100 runs, with the shaded area representing the standard error of the mean. We used six benchmark datasets from the libsvm repository (Chang and Lin, 2011), whose dimensions are given in Table 6.1. |
| Researcher Affiliation | Collaboration | Michał Dereziński EMAIL Department of Electrical Engineering & Computer Science, University of Michigan; Manfred K. Warmuth EMAIL UC Santa Cruz and Google Inc.; Daniel Hsu EMAIL Department of Computer Science, Columbia University |
| Pseudocode | Yes | Algorithm 1 Distortion-free intermediate sampling; Algorithm 2 Reverse iterative sampling (Dereziński and Warmuth, 2018) |
| Open Source Code | No | The paper does not provide a link to source code, nor does it state that code is open-source or included in supplementary materials. |
| Open Datasets | Yes | We used six benchmark datasets from the libsvm repository (Chang and Lin, 2011), whose dimensions are given in Table 6.1. |
| Dataset Splits | No | The paper mentions evaluating estimators for a range of sample sizes and averaging results over runs, but does not provide specific train/test/validation splits for the datasets used in experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies, including library or solver names with version numbers, used to replicate the experiments. |
| Experiment Setup | Yes | For each estimator we plotted the loss L_D(ŵ) for a range of sample sizes k, contrasted with the loss of the best least-squares estimator w computed from all data. Plots shown in Figure 6.2 were averaged over 100 runs, with the shaded area representing the standard error of the mean. |
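The averaging protocol quoted in the Experiment Setup row (per-estimator loss curves averaged over 100 runs, with a shaded band of the standard error of the mean) can be sketched as below. This is a minimal illustration, not the authors' code: `mean_and_sem` and the synthetic `losses` array are assumptions introduced here for clarity.

```python
import numpy as np

def mean_and_sem(losses):
    """Average per-run losses at each sample size and compute the
    standard error of the mean (sample std / sqrt(number of runs))."""
    losses = np.asarray(losses, dtype=float)
    mean = losses.mean(axis=0)
    sem = losses.std(axis=0, ddof=1) / np.sqrt(losses.shape[0])
    return mean, sem

# Hypothetical data: rows are independent runs, columns are sample sizes k.
rng = np.random.default_rng(0)
n_runs, sample_sizes = 100, [10, 20, 40, 80]
losses = rng.random((n_runs, len(sample_sizes)))

mean, sem = mean_and_sem(losses)
# The paper's plots would draw `mean` against `sample_sizes`,
# with `mean - sem` to `mean + sem` as the shaded band.
```

The `ddof=1` choice gives the unbiased sample standard deviation, the usual convention when the band is meant to estimate uncertainty in the mean across runs.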