Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Online Non-stochastic Control with Partial Feedback

Authors: Yu-Hu Yan, Peng Zhao, Zhi-Hua Zhou

JMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, empirical studies in both synthetic linear and simulated nonlinear tasks validate our method's effectiveness, thus supporting the theoretical findings." Keywords: online non-stochastic control, partial feedback, dynamic regret, online ensemble, online learning with memory, bandit convex optimization
Researcher Affiliation | Academia | Yu-Hu Yan (EMAIL), Peng Zhao (EMAIL), Zhi-Hua Zhou (EMAIL); National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China
Pseudocode | Yes | Algorithm 1 (Base Algorithm). Input: memory H, dimension d, domain K, shrinkage α, perturbation δ, step size η. 1: Initialize the corresponding variables. 2: Initialize w_1, ..., w_H with any feasible decisions for the first H rounds. For t = H+1, ..., T do: 3: Submit decision w_t and receive cost f_t(w_{t−H:t}) + ε_t. 4: Draw a random bit b_t ~ Bernoulli(1/H). If t ≥ H and b_t · ∏_{i=1}^{H−1} (1 − b_{t−i}) = 1 then: 5: Estimate the gradient via (4.3). 6: Update the decision via (4.4). Else: 7: Maintain the decision w_{t+1} = w_t.
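The control flow of Algorithm 1 can be sketched in code. The paper's gradient estimator (4.3) and update rule (4.4) are not reproduced in this report, so the sketch below substitutes a standard one-point spherical gradient estimate and a projected gradient step as placeholder assumptions; the ball-shaped domain, function names, and default hyperparameter values are likewise illustrative, not the authors' implementation.

```python
import numpy as np

def project_to_ball(w, radius):
    """Euclidean projection onto an l2 ball (stand-in for projection
    onto the shrunk domain (1 - alpha) * K)."""
    norm = np.linalg.norm(w)
    return w if norm <= radius else w * (radius / norm)

def base_algorithm(cost_oracle, T, H, d, radius=1.0, alpha=0.1,
                   delta=0.1, eta=0.01, rng=None):
    """Sketch of the bandit base algorithm's control flow.

    cost_oracle(window) returns the noisy scalar cost
    f_t(w_{t-H}, ..., w_t) + eps_t for the last H+1 decisions.
    """
    rng = np.random.default_rng(rng)
    # Line 2: any feasible decisions for the first H rounds.
    ws = [np.zeros(d) for _ in range(H)]
    w = np.zeros(d)
    recent_bits = [0] * (H - 1)      # b_{t-1}, ..., b_{t-H+1}
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)           # unit perturbation direction

    for t in range(H, T):
        # Line 3: submit a (perturbed) decision, observe a noisy cost.
        w_play = w + delta * u
        ws.append(w_play)
        cost = cost_oracle(ws[-(H + 1):])
        # Line 4: draw an exploration bit b_t ~ Bernoulli(1/H).
        b_t = rng.binomial(1, 1.0 / H)
        # Update only when b_t = 1 and the previous H-1 bits were all 0,
        # i.e. b_t * prod_{i=1}^{H-1} (1 - b_{t-i}) = 1.
        if b_t == 1 and all(b == 0 for b in recent_bits):
            # One-point gradient estimate (placeholder for Eq. (4.3)).
            grad_est = (d / delta) * cost * u
            # Projected step on the shrunk domain (placeholder for (4.4)).
            w = project_to_ball(w - eta * grad_est, (1 - alpha) * radius)
            # Fresh perturbation direction for the next estimate.
            u = rng.standard_normal(d)
            u /= np.linalg.norm(u)
        # else (line 7): keep w_{t+1} = w_t.
        recent_bits = ([b_t] + recent_bits)[: H - 1]
    return w
```

The Bernoulli(1/H) gate and the "previous H−1 bits all zero" check ensure updates are spaced at least H rounds apart on average, so a single update's effect on the H-step memory of the cost can be controlled.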
Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets | No | The paper describes experiments in "synthetic linear" and "simulated nonlinear environments" such as "pendulum," "cartpole," and "data center cooling." These are either generated data or standard simulation setups, but the paper does not provide concrete access information (e.g., URLs, DOIs, specific citations to public dataset repositories) for any external publicly available datasets.
Dataset Splits | No | The paper describes experiments in synthetic and simulated environments (time-varying linear dynamical system, pendulum, cartpole, data center cooling) and discusses how cost functions change (gradual, abrupt, mixture). It mentions restarting algorithms periodically to handle nonlinearity. However, it does not specify explicit training/validation/test dataset splits (e.g., percentages, sample counts, or references to predefined splits) in the traditional sense for empirical evaluation.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU models, or memory specifications) used for running its experiments. It only mentions general experimental settings without hardware specifics.
Software Dependencies | No | The paper mentions OpenAI Gym in the context of the cartpole environment. However, it does not provide specific version numbers for OpenAI Gym or any other software libraries, frameworks, or programming languages used in the implementation.
Experiment Setup | No | The paper states: "All hyperparameters are set to be theoretically optimal except the learning rate of the meta learners, which are scaled by constants to speed up the learning process." This is a general statement about hyperparameter tuning but does not provide concrete values for any specific hyperparameters (e.g., learning rate, batch size, number of epochs, specific optimizer settings) used in the experiments.