reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Robust Estimation of Derivatives Using Locally Weighted Least Absolute Deviation Regression

Authors: WenWu Wang, Ping Yu, Lu Lin, Tiejun Tong

JMLR 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Finally, we conduct simulation studies to demonstrate that the proposed method has better performance than the existing methods in the presence of outliers and heavy-tailed errors, and analyze the Chinese house price data for the past ten years to illustrate the usefulness of the proposed method.
Researcher Affiliation	Academia	WenWu Wang, School of Statistics, Qufu Normal University, Jingxuan West Road, Qufu, Shandong, China; Ping Yu, Faculty of Business and Economics, University of Hong Kong, Pokfulam Road, Hong Kong; Lu Lin, Zhongtai Securities Institute for Financial Studies, Shandong University, Jinan, Shandong, China; Tiejun Tong, Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Hong Kong
Pseudocode	No	The paper presents mathematical derivations, theorems, and proofs but does not include any explicitly labeled pseudocode or algorithm blocks in a structured format.
Open Source Code	No	The paper mentions using existing R packages like 'L1pack in Osorio (2015)', 'locpol in Cabrera (2012)', and 'pspline in Ramsay and Ripley (2013)', but it does not state that the authors' own implementation code for the proposed methodology is made publicly available.
Open Datasets	Yes	In this section, we apply RLowLAD to the data set of house price in two cities of China, i.e., Beijing and Jinan. We collect these monthly data from the web: http://www.creprice.cn/
Dataset Splits	No	For simulation studies, the paper generates data of size 300 or 100/500 with specified error distributions. For the empirical application, it uses monthly data from January 2008 to July 2018 (size 127). However, no explicit training, validation, or test dataset splits are mentioned for either the synthetic or real-world data.
Hardware Specification	No	Due to the heavy computation (for example, it needs more than 48 hours for the case n = 500 and k = n/5 = 100 based on 1000 repetitions on our personal computer), we choose k = n/5 uniformly. The mention of 'personal computer' is too general and does not provide specific hardware details like CPU or GPU models, processor types, or memory.
Software Dependencies	Yes	Figure 1 (2) displays the LowLAD (RLowLAD) estimator (use the R package L1pack in Osorio (2015)) and the LowLSR estimator with k ∈ {6, 12, 25, 30, 50}. We further compare LowLAD with two other well-known methods: the local polynomial regression with p = 5 (use R package locpol in Cabrera (2012)) and the penalized smoothing splines with norder = 6 and method = 4 (use R package pspline in Ramsay and Ripley (2013)).
Experiment Setup	Yes	We consider two sample sizes: n = 100 and 500, two standard deviations: σ = 0.1 and 0.5, two contaminated standard deviations: σ0 = 1 and 10, two frequencies: f = 0.5 and 1, and two amplitudes: A = 1 and 10. We use the following adjusted mean absolute error (AMAE) as the criterion of performance evaluation: $\mathrm{AMAE}(k) = \frac{1}{n-2k} \sum_{i=k+1}^{n-k} \left| \hat{m}^{(1)}(x_i) - m^{(1)}(x_i) \right|$. We choose k = n/5 uniformly.
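
The AMAE criterion quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. It assumes the sum runs over the interior points i = k+1, ..., n−k, which is consistent with the 1/(n − 2k) normalizer. The function name `amae`, the sine test function, and the constant-offset "estimate" are hypothetical choices made for illustration only; they are not the authors' code or simulation design.

```python
import numpy as np


def amae(m1_hat, m1_true, k):
    """Adjusted mean absolute error of a first-derivative estimate.

    Averages |m1_hat - m1_true| over the interior points
    i = k+1, ..., n-k (1-based), i.e. n - 2k points in total.
    The interior range is an assumption inferred from the normalizer.
    """
    m1_hat = np.asarray(m1_hat, dtype=float)
    m1_true = np.asarray(m1_true, dtype=float)
    if m1_hat.shape != m1_true.shape or m1_true.ndim != 1:
        raise ValueError("inputs must be 1-D arrays of equal length")
    n = m1_true.shape[0]
    if not 0 <= k < n / 2:
        raise ValueError("k must satisfy 0 <= k < n/2")
    # 0-based slice k .. n-k-1 corresponds to 1-based i = k+1 .. n-k
    interior = slice(k, n - k)
    return float(np.mean(np.abs(m1_hat[interior] - m1_true[interior])))


# Illustrative usage with k = n/5, as in the quoted setup.
n = 100
k = n // 5
x = np.arange(1, n + 1) / n                      # equidistant design (assumed)
true_deriv = 2 * np.pi * np.cos(2 * np.pi * x)   # derivative of sin(2*pi*x)
fake_estimate = true_deriv + 0.1                 # hypothetical estimate, off by 0.1
print(amae(fake_estimate, true_deriv, k))
```

Because the k points at each boundary are excluded, large estimation errors near the edges of the design do not affect the score.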