Optimal Estimation of Derivatives in Nonparametric Regression
Authors: Wenlin Dai, Tiejun Tong, Marc G. Genton
JMLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct simulation studies to assess the finite sample performance of the proposed estimators, $\hat{m}^{(p)}_{q}$, and make comparisons with the empirical estimator, $\hat{m}^{(p)}_{\mathrm{emp}}$, in De Brabanter et al. (2013) and the least squares estimator, $\hat{m}^{(p)}_{\mathrm{lse}}$, in Wang and Lin (2015). ... The mean absolute error (MAE) is used as a measure of estimation accuracy. ... The simulation results for $\omega = 2$ are reported as box-plot figures. |
| Researcher Affiliation | Academia | Wenlin Dai (EMAIL), CEMSE Division, King Abdullah University of Science and Technology, Saudi Arabia; Tiejun Tong (EMAIL), Department of Mathematics, Hong Kong Baptist University, Hong Kong; Marc G. Genton (EMAIL), CEMSE Division, King Abdullah University of Science and Technology, Saudi Arabia |
| Pseudocode | No | The paper describes methods and proofs primarily through mathematical equations and textual descriptions, without presenting any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code, nor does it provide a link to a code repository or mention code in supplementary materials for the methodology described. |
| Open Datasets | No | The paper generates synthetic data for simulations, stating 'random errors are generated from a Gaussian distribution, $N(0, 0.1^2)$' and defining a regression function '$m(x) = \sqrt{x(1 - x)}\,\sin\{2.1\pi/(x + 0.05)\}$'. It does not use or provide access to any external publicly available datasets. |
| Dataset Splits | No | The paper describes generating synthetic data for simulations using 'n = 100 and 500 sample sizes' and defines how 'design points' are set ($x_i = i/n$). It also defines 'interior (Int) and boundary (Bd) areas' for evaluation based on $k_0 = [n/10]$ of the design points. However, it does not specify explicit training, validation, or test dataset splits in the conventional sense, as all data is generated for simulation. |
| Hardware Specification | No | The paper describes conducting 'extensive simulation studies' but does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run these experiments. |
| Software Dependencies | Yes | Here, $\hat{m}^{(q)}(x_i)$ ($1 + k_0 \le i \le n - k_0$) are calculated with the function `locpol` in the R package locpol (Ojeda Cabrera, 2012) with the parameter `deg = q + 2`. |
| Experiment Setup | Yes | We consider the following regression function, $m(x) = 5\sin(\omega\pi x)$, with $\omega = 1, 2, 4$ corresponding to different levels of oscillations. The n = 100 and 500 sample sizes are investigated. We set the design points as $x_i = i/n$ and generate the random errors, $\varepsilon_i$, independently from $N(0, \sigma^2)$. For each regression function, we consider $\sigma = 0.1$, 0.5 and 2... Throughout the simulation, we set $k_0 = [n/10]$... We select the sequence order r from $O = \{2i : 1 \le i \le k_0\}$. We choose the bias-reduction level, q, from $Q = \{p + 2, p + 4, p + 6\}$.... For each run of the simulation, we compute the MAE of the estimators at both Int and Bd and repeat the procedure 1000 times for each setting. |
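
The data-generating part of the Experiment Setup row can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code (the paper releases none and uses R). All function names are hypothetical. The symmetric difference quotient is only a stand-in estimator, not the proposed estimator or either competitor. The interior is assumed to be the index range $1 + k_0 \le i \le n - k_0$, the boundary is not evaluated, and the number of repetitions is reduced from 1000 to keep the example quick.

```python
import math
import random


def simulate(n, omega, sigma, rng):
    """One data set: x_i = i/n and y_i = 5 sin(omega*pi*x_i) + eps_i, eps_i ~ N(0, sigma^2)."""
    x = [i / n for i in range(1, n + 1)]
    y = [5.0 * math.sin(omega * math.pi * xi) + rng.gauss(0.0, sigma) for xi in x]
    return x, y


def true_first_derivative(x, omega):
    """Analytic first derivative of m(x) = 5 sin(omega*pi*x)."""
    return [5.0 * omega * math.pi * math.cos(omega * math.pi * xi) for xi in x]


def candidate_orders(k0):
    """The candidate set O = {2i : 1 <= i <= k0} for the sequence order r."""
    return [2 * i for i in range(1, k0 + 1)]


def symmetric_quotient(x, y, lag):
    """Stand-in estimator (NOT the paper's): (y[i+lag] - y[i-lag]) / (x[i+lag] - x[i-lag]).

    Entries where the window does not fit are left as None.
    """
    n = len(x)
    est = [None] * n
    for i in range(lag, n - lag):
        est[i] = (y[i + lag] - y[i - lag]) / (x[i + lag] - x[i - lag])
    return est


def mae(est, truth, idx):
    """Mean absolute error over the index set idx."""
    errors = [abs(est[i] - truth[i]) for i in idx]
    return sum(errors) / len(errors)


def run(n=100, omega=2, sigma=0.1, reps=20, lag=5, seed=0):
    """Repeat the simulation and return the interior MAE of each run (requires lag <= n // 10)."""
    rng = random.Random(seed)
    k0 = n // 10  # k0 = [n/10]
    interior = range(k0, n - k0)  # assumed interior, 0-based: 1 + k0 <= i <= n - k0
    results = []
    for _ in range(reps):
        x, y = simulate(n, omega, sigma, rng)
        est = symmetric_quotient(x, y, lag)
        results.append(mae(est, true_first_derivative(x, omega), interior))
    return results


if __name__ == "__main__":
    maes = run()
    print(sum(maes) / len(maes))  # average interior MAE over the repetitions
```

Swapping `symmetric_quotient` for another estimator leaves the rest of the loop unchanged, which is the only design point the sketch is meant to show.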