reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

High Dimensional Forecasting via Interpretable Vector Autoregression

Authors: William B. Nicholson, Ines Wilms, Jacob Bien, David S. Matteson

JMLR 2020 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	A simulation study demonstrates improved performance in forecasting and lag order selection over previous approaches, and macroeconomic, financial, and energy applications further highlight forecasting improvements as well as HLag's convenient, interpretable output.
Researcher Affiliation	Collaboration	William B. Nicholson (Point72 Asset Management, L.P., New York, USA); Ines Wilms (Department of Quantitative Economics, Maastricht University, Maastricht, The Netherlands); Jacob Bien (Department of Data Sciences and Operations, Marshall School of Business, University of Southern California, California, USA); David S. Matteson (Department of Statistics and Data Science, Cornell University, Ithaca, USA)
Pseudocode	Yes	Algorithm 1: General algorithm for HLag with penalty Ω_i. Algorithm 2: Solving Problem (10).
Open Source Code	Yes	Implementations of our methods are available in the R package BigVAR, which is hosted on the Comprehensive R Archive Network (CRAN).
Open Datasets	Yes	Our first and main application is macroeconomic forecasting (Section 6.1). We apply the proposed HLag methods to a collection of US macroeconomic time series compiled by Stock and Watson (2005) and augmented by Koop (2013). The full data set, publicly available at The Journal of Applied Econometrics Data Archive, contains 168 quarterly macroeconomic indicators over 45 years: Quarter 2, 1959 to Quarter 4, 2007, hence T = 195. We apply the HLag methods to a financial data set containing realized variances for k = 16 stock market indices... Daily realized variances based on five minute returns are taken from Oxford-Man Institute of Quantitative Finance (publicly available on http://realized.oxford-man.ox.ac.uk/data/download). We apply the HLag methods to an energy data set (Candanedo et al., 2017) containing information on k = 26 variables related to in-house energy usage, temperature and humidity conditions. Data are taken from the publicly available UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/Appliances+energy+prediction).
Dataset Splits	Yes	Given an evaluation period [T1, T2], we use one-step-ahead mean-squared forecast error (MSFE) as a cross-validation score... Quarter 3, 1977 (T1) to Quarter 3, 1992 (T2) is used for penalty parameter selection; Quarter 4, 1992 (T3) to Quarter 4, 2007 (T4) are used for out-of-sample rolling window forecast comparisons.
Hardware Specification	Yes	Average computation times, in seconds on an Intel Core i7-6820HQ 2.70GHz machine including the penalty parameter search... For instance, for the Large VAR (k = 168, T = 195, and 113,064 parameters) estimated on the Stock and Watson data, the HLag methods only require (on an Intel Xeon Gold 6126 CPU @ 2.60GHz machine) around 1.5 (Own-Other), 2 (Componentwise) and 3.5 minutes (Elementwise), including penalty parameter selection.
Software Dependencies	No	The paper mentions implementations in R and the use of Matlab code but does not provide specific version numbers for R, Matlab, or any key third-party libraries used in the experiments. The 'BigVAR' R package is the authors' own implementation.
Experiment Setup	Yes	The HLag methods rely on a single tuning parameter λ in equation (8). The grid of penalty values is constructed by starting with λmax, an estimate of the smallest value in which all coefficients are zero, then decrementing in log-linear increments. The grid bounds are detailed in the appendix of Nicholson et al. (2017). We estimate four VAR models on this data set: The Small-Medium VAR (k = 10)... The Medium VAR (k = 20)... The Medium-Large VAR (k = 40)... The Large VAR (k = 168). We compare the forecast performance of the HLag methods to their competitors on the four VAR models with pmax = 4... Finally, we re-did our forecast exercise for longer forecast horizons h = 4 and h = 8.
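
The Dataset Splits and Experiment Setup rows describe a penalty grid that starts at λmax and decreases in log-linear steps, with the penalty chosen by one-step-ahead MSFE over [T1, T2] and then assessed out of sample over [T3, T4]. The Python sketch below illustrates that selection loop only. It is not the authors' code and does not use BigVAR: the names penalty_grid, rolling_msfe, depth, and n_values are hypothetical, the lower grid bound lam_max / depth is an assumption (the actual bounds are in the appendix of Nicholson et al. (2017)), and a shrunken last-value forecast stands in for an HLag fit.

```python
import math


def penalty_grid(lam_max, depth=50.0, n_values=10):
    """Penalties equally spaced on the log scale, from lam_max down to lam_max / depth.

    depth and n_values are illustrative assumptions, not values from the paper.
    """
    hi = math.log(lam_max)
    step = math.log(depth) / (n_values - 1)
    return [math.exp(hi - i * step) for i in range(n_values)]


def rolling_msfe(series, forecaster, t_start, t_end):
    """Mean squared one-step-ahead forecast error for targets t_start..t_end (0-based, inclusive).

    forecaster(history) must predict series[t] from series[:t] only.
    """
    errors = [(series[t] - forecaster(series[:t])) ** 2
              for t in range(t_start, t_end + 1)]
    return sum(errors) / len(errors)


def make_forecaster(lam):
    """Toy stand-in for a penalized fit: shrink the last observation toward zero."""
    return lambda history: history[-1] / (1.0 + lam)


# Synthetic, deterministic series (an assumption for illustration, not paper data).
series = [0.9 ** t for t in range(40)]
t1, t2 = 10, 24   # penalty selection window, playing the role of [T1, T2]
t3, t4 = 25, 39   # out-of-sample evaluation window, playing the role of [T3, T4]

grid = penalty_grid(lam_max=1.0)
cv_scores = [rolling_msfe(series, make_forecaster(lam), t1, t2) for lam in grid]
best_lam = grid[min(range(len(grid)), key=cv_scores.__getitem__)]

# The selected penalty is then scored on data not used for selection.
oos_msfe = rolling_msfe(series, make_forecaster(best_lam), t3, t4)
print(f"selected penalty: {best_lam:.4f}, out-of-sample MSFE: {oos_msfe:.3e}")
```

Keeping the selection window and the evaluation window disjoint mirrors the split quoted above, so the reported out-of-sample error is not influenced by the penalty search.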