reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Robust exploration in linear quadratic reinforcement learning

Authors: Jack Umenberger, Mina Ferizbegovic, Thomas B. Schön, Håkan Hjalmarsson

NeurIPS 2019 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Numerical simulations and application to a hardware-in-the-loop servo-mechanism demonstrate the approach, with appreciable performance and robustness gains over alternative methods observed in both.
Researcher Affiliation	Academia	Jack Umenberger (Department of Information Technology, Uppsala University, Sweden); Mina Ferizbegovic (School of Electrical Engineering and Computer Science, KTH, Sweden); Thomas B. Schön (Department of Information Technology, Uppsala University, Sweden); Håkan Hjalmarsson (School of Electrical Engineering and Computer Science, KTH, Sweden)
Pseudocode	Yes	Algorithm 1 Receding horizon application to true system
Open Source Code	No	The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets	No	The paper uses custom-generated data from simulations and a physical servo mechanism; it does not use a publicly available dataset with concrete access information.
Dataset Splits	No	The paper describes how initial data is obtained and used for trials, but it does not specify explicit training, validation, and test dataset splits.
Hardware Specification	Yes	for a hardware-in-the-loop simulation comprised of the interconnection of a physical servo mechanism (Quanser QUBE 2) and a synthetic (simulated) LTI dynamical system.
Software Dependencies	No	The paper mentions techniques like convex optimization and semidefinite programming, but it does not specify any particular software libraries, tools, or their version numbers that were used.
Experiment Setup	Yes	We partition the time horizon T = 10^3 into N = 10 equally spaced intervals, each of length T_i = 100. For robustness, we set δ = 0.05. [...] with look-ahead horizon h = 10. The total control horizon was T = 1250 (2.5 seconds at 500 Hz) and was divided into N = 5 intervals, each of duration 0.5 seconds.
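
The horizon figures quoted in the Experiment Setup row can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not code from the paper: the helper name `partition_horizon` and the variable names are hypothetical. The inputs (T = 10^3 with N = 10, and T = 1250 with N = 5 at 500 Hz) are taken from the row above.

```python
def partition_horizon(total_steps, num_intervals):
    """Split the step range [0, total_steps) into equal (start, end) intervals.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if total_steps % num_intervals != 0:
        raise ValueError("total_steps must be divisible by num_intervals")
    length = total_steps // num_intervals
    return [(i * length, (i + 1) * length) for i in range(num_intervals)]


# First quoted setup: T = 10^3 split into N = 10 intervals.
first_setup = partition_horizon(1000, 10)
first_interval_length = first_setup[0][1] - first_setup[0][0]

# Second quoted setup: T = 1250 steps at 500 Hz, split into N = 5 intervals.
sample_rate_hz = 500
second_setup = partition_horizon(1250, 5)
second_interval_steps = second_setup[0][1] - second_setup[0][0]
second_interval_seconds = second_interval_steps / sample_rate_hz
second_total_seconds = 1250 / sample_rate_hz

print(first_interval_length, second_interval_steps,
      second_interval_seconds, second_total_seconds)
```

Under these assumptions the quoted numbers are mutually consistent: each interval in the first setup spans 100 steps, and each interval in the second spans 250 steps, which is 0.5 seconds at 500 Hz, for a total of 2.5 seconds.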