Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Koopman Spectrum Nonlinear Regulators and Efficient Online Learning
Authors: Motoya Ohnishi, Isao Ishikawa, Kendall Lowrey, Masahiro Ikeda, Sham M. Kakade, Yoshinobu Kawahara
TMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate how one can use the costs in Example 3.1. See Appendix H for detailed descriptions and results of the experiments. Throughout, we used Julia language (Bezanson et al., 2017) based robotics control package, Lyceum (Summers et al., 2020), for simulations and visualizations. Also, we use Cross Entropy Method (CEM) based policy search (Kobilarov (2012); one of the population based policy search techniques) to optimize the policy parameter Θ to minimize the cost in (3.2). Figure 4 (Left) plots the trajectories (of the Cartesian coordinates) generated by RFF policies that minimize this cost; it is observed that the agent successfully converged to the desired limit cycle of radius one by imitating the dominant mode of the target spectrum. Figure 4 (Right) plots the cart velocity trajectories generated by RFF policies that (approximately) solve KSNR with/without the spectrum cost. It is observed that spectral regularization led to a back-and-forth motion while the non-regularized policy preferred accelerating to one direction to solely maximize velocity. |
| Researcher Affiliation | Collaboration | Motoya Ohnishi EMAIL Paul G. Allen School of Computer Science & Engineering University of Washington Isao Ishikawa EMAIL Ehime University RIKEN Center for Advanced Intelligence Project Kendall Lowrey EMAIL Et Cetera Robotics Masahiro Ikeda EMAIL RIKEN Center for Advanced Intelligence Project Keio University Sham Kakade EMAIL Harvard University Yoshinobu Kawahara EMAIL Graduate School of Information Science and Technology, Osaka University RIKEN Center for Advanced Intelligence Project |
| Pseudocode | Yes | Algorithm 1 Koopman-Spectrum LC3 (KS-LC3) Require: Parameter set Π; regularizer λ |
| Open Source Code | No | The paper makes no explicit statement about releasing its implementation and gives no link to a repository for its methodology. It mentions using existing tools such as Julia, Lyceum, OpenAI Gym, DeepMind Control Suite, and MuJoCo, but does not provide code for the methodology developed in the paper. |
| Open Datasets | Yes | We consider Cartpole environment (where the rail length is extended from the original model). We use the Walker2d environment and compare movements with/without the spectrum cost. Used DeepMind Control Suite Cartpole environment with modifications... Also, we used a combination of linear and RFF features; the first elements of the feature are simply the observation (state) vector, and the rest are Gaussian RFFs. [16] G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, and W. Zaremba. OpenAI Gym. arXiv preprint arXiv:1606.01540, 2016. [63] Y. Tassa, Y. Doron, A. Muldal, T. Erez, Y. Li, D. L. Casas, D. Budden, A. Abdolmaleki, J. Merel, A. Lefrancq, et al. DeepMind Control Suite. arXiv preprint arXiv:1801.00690, 2018. |
| Dataset Splits | No | The paper describes experiments conducted in simulation environments (Cartpole, Walker2d, etc.) where data is generated dynamically through interactions. It mentions |
| Hardware Specification | Yes | Julia Version 1.5.3 Platform Info: OS: Linux (x86_64-pc-linux-gnu) CPU: AMD Ryzen Threadripper 3990X 64-Core Processor WORD_SIZE: 64 LIBM: libopenlibm LLVM: libLLVM-9.0.1 (ORCJIT, znver2) Environment: JULIA_NUM_THREADS = 12 Julia Version 1.5.3 Platform Info: OS: Linux (x86_64-pc-linux-gnu) CPU: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz WORD_SIZE: 64 LIBM: libopenlibm LLVM: libLLVM-9.0.1 (ORCJIT, haswell) Environment: JULIA_NUM_THREADS = 12 |
| Software Dependencies | Yes | Julia Version 1.5.3 Platform Info: The licenses of Julia, OpenAI Gym, DeepMind Control Suite, Lyceum, and MuJoCo, are [The MIT License...], and [MuJoCo Pro Lab license], respectively. MuJoCo version 2.0 is used (license: MuJoCo Pro Individual license). |
| Experiment Setup | Yes | Table 1: Hyperparameters used for limit cycle generation (column groups: CEM hyperparameter, value; training target Koopman operator, value). samples: 200; training iteration: 500; elite size: 20; RFF bandwidth for ϕ: 3.0; iteration: 50; RFF dimension dϕ: 80; planning horizon: 80; horizon for each iteration: 80; policy RFF dimension: 50; policy RFF bandwidth: 2.0. Table 2: Hyperparameters used for stable loop generation. samples: 200; elite size: 20; iteration: 100; planning horizon: 100; dimension dϕ: 50; RFF bandwidth for ϕ: 2.0; policy RFF dimension: 100; policy RFF bandwidth: 2.0. Table 3: Hyperparameters used for Walker. samples: 300; elite size: 20; iteration: 120; planning horizon: 300; dimension of dϕ: 200; RFF bandwidth for ϕ: 5.0; policy RFF dimension: 300; policy RFF bandwidth: 30.0. Table 4: Hyperparameters used for KS-LC3 in the simple linear case. prior parameter λ: 0.05; covariance scale ι: 0.0001; planning horizon: 50. |
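
To make the CEM hyperparameters quoted above concrete, the following is a minimal sketch of a diagonal-Gaussian Cross Entropy Method loop in plain Python. It is not the authors' Julia/Lyceum implementation. The function names, the initial standard deviation, the toy quadratic cost, and its target vector are assumptions made for illustration. Only the default values for samples (200), elite size (20), and iterations (50) come from Table 1.

```python
import random
import statistics


def cem_minimize(cost, dim, samples=200, elite_size=20, iterations=50,
                 init_std=2.0, seed=0):
    """Minimize `cost` over R^dim with a diagonal-Gaussian CEM (illustrative sketch)."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [init_std] * dim  # init_std is an assumed value, not from the paper
    for _ in range(iterations):
        # Sample candidate parameter vectors around the current mean.
        population = [
            [rng.gauss(mean[d], std[d]) for d in range(dim)]
            for _ in range(samples)
        ]
        # Keep the lowest-cost candidates (the "elites").
        elites = sorted(population, key=cost)[:elite_size]
        # Refit the sampling distribution to the elites, one dimension at a time.
        for d in range(dim):
            column = [e[d] for e in elites]
            mean[d] = statistics.fmean(column)
            std[d] = statistics.pstdev(column) + 1e-6  # small floor keeps std positive
    return mean


def toy_cost(theta):
    # Hypothetical stand-in for the paper's cost (3.2): squared distance to a fixed target.
    target = [1.0, -1.0, 0.5]
    return sum((t - g) ** 2 for t, g in zip(theta, target))


best = cem_minimize(toy_cost, dim=3)
```

In the paper's setting the cost would instead be evaluated by rolling out an RFF policy in simulation over the planning horizon, which this sketch does not attempt to reproduce.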