Multivariate Bayesian Structural Time Series Model
Authors: Jinwen Qiu, S. Rao Jammalamadaka, Ning Ning
JMLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive simulations were run to investigate properties such as estimation accuracy and forecasting performance. This was followed by an empirical study of one-step-ahead prediction of the maximum log return of a portfolio of stocks from four leading financial institutions. Both the simulation studies and the extensive empirical study confirm that this multivariate model outperforms three benchmark models, viz. a model that treats each target series as independent, the autoregressive integrated moving average model with regression (ARIMAX), and the multivariate ARIMAX (MARIMAX) model. |
| Researcher Affiliation | Academia | Jinwen Qiu, S. Rao Jammalamadaka, and Ning Ning; Department of Statistics and Applied Probability, University of California, Santa Barbara, CA 93106, USA |
| Pseudocode | Yes | The paper presents pseudocode as Algorithm 1 (MBSTS Model Training). |
| Open Source Code | No | The paper includes a license for the paper content itself (CC-BY 4.0) and attribution requirements, but does not contain any explicit statement about releasing source code for the methodology described, nor does it provide a link to a code repository. |
| Open Datasets | No | The paper mentions using "computer-generated data" for simulations and real-world data "obtained from Google Finance" and "Google Domestic Trends". While Google Domestic Trends is noted as "publicly available", the paper does not provide specific links, DOIs, or citations with author/year information to the exact datasets (both simulated and real-world) in a way that allows concrete access for reproduction. |
| Dataset Splits | No | The paper states: "The generated data sets were split into a certain period of training data and a subsequent period of testing set. The standard approach would use the training data to develop the model that would then be applied to obtain predictions for the testing period. We use a growing window approach, which simply adds one new observation in the test set to the existing training set, obtaining a new model with fresher data and then constantly forecasting a new value in the test set." This describes a splitting methodology (growing window approach) but does not provide specific percentages, absolute sample counts, or references to predefined splits needed for exact reproduction. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper does not provide specific software names with version numbers that would be necessary to replicate the experiments. |
| Experiment Setup | Yes | After model training, we drew 2000 samples for each coefficient to be estimated during MCMC iterations. To reduce the influence of initial values on posterior inferences, we discarded an initial portion of the Markov chain samples. Specifically, based on trial and error, the first 200 drawn samples were removed and the rest were used to build a sample posterior distribution for each parameter. Through cross validation, we found the optimal damping factor equals 0.95 in terms of cumulative one-step prediction errors. |
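The growing-window evaluation quoted under Dataset Splits can be sketched in a few lines. This is a minimal illustration of the splitting scheme only; the forecaster here is a hypothetical stand-in (a naive running-mean predictor), not the MBSTS model, and the function names are my own.

```python
# Sketch of the growing-window (expanding-window) evaluation described above:
# after each one-step-ahead prediction, the newly observed test value is added
# to the training window, and the model is refit on the "fresher" data.

def growing_window_forecast(series, n_train, forecast):
    """Return one-step-ahead predictions over series[n_train:]."""
    preds = []
    train = list(series[:n_train])
    for actual in series[n_train:]:
        preds.append(forecast(train))  # predict next value from current window
        train.append(actual)           # grow the training window by one point
    return preds

def naive_mean(train):
    # Placeholder model: predict the running mean of the training window.
    return sum(train) / len(train)
```

For example, with a training window of 3 on the series `[1, 2, 3, 4, 5, 6]`, the scheme produces three one-step-ahead predictions, each from a window one observation larger than the last.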
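The burn-in scheme in the Experiment Setup row (2000 MCMC draws per parameter, first 200 discarded) can likewise be sketched. The sampler below is a toy stand-in for one MCMC draw of a scalar parameter, assumed for illustration; it is not the paper's Gibbs sampler.

```python
# Minimal sketch of the stated burn-in procedure: draw n_draws samples,
# discard the first n_burn as burn-in, and summarize the remaining samples
# as an empirical posterior.

import random

def posterior_summary(sampler, n_draws=2000, n_burn=200, seed=0):
    rng = random.Random(seed)
    draws = [sampler(rng) for _ in range(n_draws)]
    kept = draws[n_burn:]             # discard burn-in samples
    mean = sum(kept) / len(kept)      # posterior mean from retained draws
    return mean, kept

def toy_sampler(rng):
    # Hypothetical stand-in for one MCMC draw of a scalar parameter.
    return rng.gauss(1.0, 0.1)
```

With the paper's settings, 1800 retained draws per parameter form the sample posterior distribution used for inference.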