Homogeneity Structure Learning in Large-scale Panel Data with Heavy-tailed Errors

Authors: Di Xiao, Yuan Ke, Runze Li

JMLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive numerical experiments demonstrate the advantage of the proposed learning procedure over conventional methods, especially when the data are generated from heavy-tailed distributions.
Researcher Affiliation | Academia | Di Xiao, Department of Statistics, University of Georgia, Athens, GA 30602, USA; Yuan Ke, Department of Statistics, University of Georgia, Athens, GA 30602, USA; Runze Li, Department of Statistics, The Pennsylvania State University, University Park, PA 16802, USA
Pseudocode | Yes | Algorithm 1: Change-points detection with wild binary segmentation; Algorithm 2: Robust homogeneity structure learning
Open Source Code | No | The WBS package is available at https://cran.r-project.org/web/packages/wbs/index.html. The FarmTest package is available at https://cran.r-project.org/web/packages/FarmTest/index.html. These are third-party tools used in the paper, not the authors' own implementation of the methodology described.
Open Datasets | Yes | A detailed description of this panel data can be found in Appendix A of Ludvigson and Ng (2009). The data set is available at https://www.epa.gov/outdoor-air-quality-data.
Dataset Splits | Yes | Starting from the first day in the data set, we use a window size of 250 days as the training set to predict the next 50 days. Each time, the window moves 20 days forward.
Hardware Specification | Yes | The average wall-time running costs of Our, Ah and Mom are 0.8, 11 and 7 seconds per replication, respectively. For each method, we simulate 200 replications on the same computer cluster node with Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz and 256 GB RAM.
Software Dependencies | No | The WBS package is available at https://cran.r-project.org/web/packages/wbs/index.html. The FarmTest package is available at https://cran.r-project.org/web/packages/FarmTest/index.html. The paper mentions these R packages but does not provide specific version numbers for them or for the R environment itself.
Experiment Setup | Yes | Throughout this section, we set N = 100, T = 200, p = 30 and q = 2. For each scenario, we simulate 200 replications unless otherwise specified. The structures (i) and (ii) have 5 and 9 groups, respectively. For both structures, the signal strength r is set to be 1 (weak), 2 (medium), or 4 (strong). In our numerical studies, we choose R = 5000 and ξ = 2 ln T, which are the default values in the WBS package. With qmax = 4 and CT = 0.01, the modified ratio method (13) estimates the number of factors to be 1. We propose to use 5-fold cross-validation to select their robustification parameters in our numerical studies.
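
The rolling-window scheme quoted under Dataset Splits (train on 250 days, predict the next 50, then move the window 20 days forward) can be sketched as below. This is only an illustration, not the authors' code: the function name `rolling_splits`, the zero-based day indices, and the choice to drop any window whose 50-day prediction block would run past the end of the data are all assumptions, since the summary does not say how the final partial window is handled.

```python
from typing import Iterator, Tuple


def rolling_splits(
    n_days: int,
    train_size: int = 250,  # window size quoted in the summary
    test_size: int = 50,    # prediction horizon quoted in the summary
    step: int = 20,         # forward shift quoted in the summary
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges for a rolling-window evaluation.

    Hypothetical helper: indices are zero-based, and a window is skipped
    when its test block would extend beyond the last day (an assumption).
    """
    start = 0
    while start + train_size + test_size <= n_days:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += step


if __name__ == "__main__":
    # n_days=340 is an arbitrary illustrative length, not the real data size.
    for train, test in rolling_splits(340):
        print(f"train days {train.start}-{train.stop - 1}, "
              f"test days {test.start}-{test.stop - 1}")
```

For a hypothetical series of 340 days this yields three splits, the last one training on days 40 to 289 and predicting days 290 to 339. Consecutive training windows overlap by 230 days, so the resulting prediction errors are not independent across windows.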