Smoothed Nonparametric Derivative Estimation using Weighted Difference Quotients
Authors: Yu Liu, Kris De Brabanter
JMLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In section 4, we conduct Monte Carlo experiments to compare the proposed methodology with smoothing splines and local polynomial regression. |
| Researcher Affiliation | Academia | Yu Liu, Department of Computer Science, Iowa State University, Ames, IA 50011, USA; Kris De Brabanter, Department of Statistics and Department of Industrial Manufacturing & Systems Engineering, Iowa State University, Ames, IA 50011, USA |
| Pseudocode | No | The paper describes procedures and mathematical derivations but does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps in a code-like format. |
| Open Source Code | No | The paper mentions using third-party R packages like 'ks', 'locpol', and 'pspline' but does not provide concrete access to source code developed specifically for the methodology described in this paper by the authors. There is no explicit statement of code release or a link to a repository for their own implementation. |
| Open Datasets | No | The paper primarily uses simulated data generated from mathematical functions and distributions (e.g., 'm(X) = cos^2(2πX) + log(4/3 + X) for X ~ U(0, 1)', 'm(X) = 50e^(-8(1-2X)^4(1-2X)) for X ~ beta(2, 2)'). It does not specify or provide access information for any publicly available or open external datasets. |
| Dataset Splits | No | The paper uses simulated data generated for Monte Carlo experiments, describing sample sizes (e.g., 'n = 1000' or 'n = 700') and the number of repetitions ('100 times'). However, it does not describe specific training, validation, or test dataset splits in the conventional sense, as data is generated anew for each simulation run rather than being split from a fixed dataset. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, cloud instances) used to conduct the simulation experiments. |
| Software Dependencies | Yes | In all simulations, we estimate the density f and distribution F using kernel methods (R package ks (Duong, 2018)). The tuning parameter k is selected based on Corollary 5 over a positive integer set {1, 2, . . . , 499}. We use local cubic regression (p = 3) with bimodal kernel to initially smooth the data. Bandwidths h were selected from the set {0.04, 0.045, . . . , 0.1} for both (23) and (24) and corrected for a unimodal Gaussian kernel. Next, we compare the proposed methodology with several popular methods for nonparametric derivative estimation, i.e. the local slope of the local polynomial regression with p = 2, p = 3 (R package locpol (Ojeda, 2012)) and penalized smoothing splines (R package pspline (Ramsey and Ripley, 2017)). |
| Experiment Setup | Yes | The tuning parameter k is selected based on Corollary 5 over a positive integer set {1, 2, . . . , 499}. We use local cubic regression (p = 3) with bimodal kernel to initially smooth the data. Bandwidths h were selected from the set {0.04, 0.045, . . . , 0.1} for both (23) and (24) and corrected for a unimodal Gaussian kernel. The sample size for both models is n = 1000 with e ~ N(0, 0.1^2) and e ~ N(0, 2^2) for (23) and (24) respectively. For the Monte Carlo study, we constructed data sets of size n = 700 and generated the function... 100 times according to model (2) with e ~ N(0, 0.2^2). Bandwidths were selected from the set {0.03, 0.035, . . . , 0.07} and corrected for a unimodal Gaussian kernel. In this simulation, we choose the grid search space of (k1, k2) to be {1, 2, . . . , 100} x {1, 2, . . . , 100} for all models. Bandwidths h are selected from the set {0.05, 0.055, . . . , 0.1} for both functions (23) and (27). |
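Since the paper relies entirely on simulated data, the first model it describes is easy to regenerate. The sketch below is a minimal illustration, assuming the model m(X) = cos^2(2πX) + log(4/3 + X) with X ~ U(0, 1), e ~ N(0, 0.1^2), and n = 1000 as quoted above; the `diff_quotient` helper is a generic symmetric first-order difference quotient for illustration only, not the authors' weighted estimator.

```python
import numpy as np

# Regenerate simulated data matching the quoted setup for model (23):
# m(X) = cos^2(2*pi*X) + log(4/3 + X), X ~ U(0, 1), e ~ N(0, 0.1^2), n = 1000.
rng = np.random.default_rng(0)
n = 1000
x = np.sort(rng.uniform(0.0, 1.0, n))
m = np.cos(2 * np.pi * x) ** 2 + np.log(4.0 / 3.0 + x)
y = m + rng.normal(0.0, 0.1, n)

def diff_quotient(x, y, k=10):
    """Naive symmetric first-order difference quotient over a gap of k
    neighboring points (a plain baseline, not the paper's weighted version)."""
    num = y[2 * k:] - y[:-2 * k]
    den = x[2 * k:] - x[:-2 * k]
    return x[k:-k], num / den

xd, d_hat = diff_quotient(x, y, k=10)
```

Widening the gap k trades variance for bias, which is exactly the tuning problem the paper's Corollary 5 addresses when selecting k from {1, 2, . . . , 499}.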