reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Predictive Control and Regret Analysis of Non-Stationary MDP with Look-ahead Information

Authors: Ziyi Zhang, Yorie Nakahira, Guannan Qu

TMLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our theoretical analysis demonstrates that, under certain assumptions, the regret decreases exponentially as the look-ahead window expands. When the system prediction is subject to error, the regret does not explode even if the prediction error grows sub-exponentially as a function of the prediction horizon. We validate our approach through simulations and confirm its efficacy in non-stationary environments. ... 6 Simulation
Researcher Affiliation	Academia	Ziyi Zhang EMAIL Department of Electrical and Computer Engineering Carnegie Mellon University Yorie Nakahira EMAIL Department of Electrical and Computer Engineering Carnegie Mellon University Guannan Qu EMAIL Department of Electrical and Computer Engineering Carnegie Mellon University
Pseudocode	Yes	Algorithm 1 Model predictive dynamical programming (MPDP) 1: Select v(0) ∈ R^n, specify ϵ > 0, and set S = 0. 2: for t = 0, 1, 2, ..., T do 3: Forecast P̂_t, ..., P̂_{t+k}, r̂_t, ..., r̂_{t+k}. 4: Select a_t according to equation 10. 5: s_{t+1} ∼ P_t(· | s_t, a_t).
Open Source Code	No	The paper does not contain an explicit statement about releasing source code, nor does it provide a link to a code repository.
Open Datasets	No	In the first simulation, we simulate a queueing system based on the setup provided in Example 1. ... In this section, we consider a scenario of EV charging station under the setup of Example 2 with time horizon T = 50.
Dataset Splits	No	For each k ∈ {1, ..., 15}, we run 20 trials and record the average regret for each k value. The paper describes simulations and trials, but does not mention specific train/test/validation splits for any dataset.
Hardware Specification	No	The paper mentions running 'simulations' but does not specify any particular hardware (e.g., GPU, CPU models, memory) used for these simulations.
Software Dependencies	No	The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiment.
Experiment Setup	Yes	Specifically, we consider a representative example of 3 servers whose service rates {µ_i}_{i=1,2,3} are 100, 10, 1, respectively, with time horizon T = 100 and varying load λ_t fluctuating from 10 to 100. ... The agent has access to the predicted arrival rate of jobs with some Gaussian additive prediction error λ̂_t := λ_t + N(0, σ) with σ ∈ {0, 1, 2}. ... In this section, we consider a scenario of EV charging station under the setup of Example 2 with time horizon T = 50. The charging station has three charging stands, and the energy price fluctuates between 2 and 18.
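
The prediction-error model quoted in the Experiment Setup row, λ̂_t := λ_t + N(0, σ) with σ ∈ {0, 1, 2} and T = 100, can be sketched as below. This is a minimal illustration, not the authors' code: the sinusoidal load profile, the function name `noisy_forecast`, the seed, and the reading of σ as a standard deviation (rather than a variance) are all assumptions, since the quoted text only says that the load fluctuates from 10 to 100.

```python
import numpy as np


def noisy_forecast(true_rates, sigma, rng):
    """Return lambda_hat_t = lambda_t + N(0, sigma) for every time step.

    sigma is treated as a standard deviation (an assumption; the quoted
    text does not say whether sigma is a standard deviation or a variance).
    """
    true_rates = np.asarray(true_rates, dtype=float)
    noise = rng.normal(loc=0.0, scale=sigma, size=true_rates.shape)
    return true_rates + noise


T = 100  # time horizon from the quoted queueing setup
t = np.arange(T)

# Hypothetical load profile: one sinusoidal cycle spanning [10, 100].
# The source only states the range, not the shape of the fluctuation.
true_rates = 55.0 + 45.0 * np.sin(2 * np.pi * t / T)

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# One forecast trajectory per noise level listed in the setup.
forecasts = {sigma: noisy_forecast(true_rates, sigma, rng) for sigma in (0, 1, 2)}
```

With σ = 0 the forecast coincides with the true arrival rate, which corresponds to the error-free look-ahead case; larger σ values perturb every predicted rate independently.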