KIPPO: Koopman-Inspired Proximal Policy Optimization
Authors: Andrei Cozma, Landon Harris, Hairong Qi
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results demonstrate consistent improvements over the PPO baseline with 6–60% increased performance while reducing variability by up to 91% when evaluated on various continuous control tasks. |
| Researcher Affiliation | Academia | Andrei Cozma, Landon Harris, and Hairong Qi; University of Tennessee, Knoxville |
| Pseudocode | No | We refer readers to the supplementary materials for complete implementation details and pseudocode. |
| Open Source Code | Yes | Extended version with comprehensive appendices containing ablation studies, hyperparameter analyses, pseudocode, and implementation details is available at: https://andreicozma.com/KIPPO. |
| Open Datasets | Yes | We evaluate six continuous control environments from Gymnasium [Towers et al., 2023] using MuJoCo [Todorov et al., 2012] and Box2D [Catto, 2007] |
| Dataset Splits | No | The paper does not describe traditional training/test/validation dataset splits for static datasets, as it uses reinforcement learning environments where data is generated dynamically through interaction. It mentions mini-batches for optimization: "The algorithm divides 2,048 steps into 32 mini-batches". |
| Hardware Specification | No | Hardware specifications and reference runtime are provided in the supplementary material. |
| Software Dependencies | No | The paper mentions using "PPO and RPO implementations from the CleanRL library [Huang et al., 2022]" but does not specify version numbers for CleanRL or other software components. |
| Experiment Setup | Yes | Each rollout phase collects 2,048 environment steps across multiple trajectories... The algorithm divides 2,048 steps into 32 mini-batches... The optimization process runs for 10 epochs... Each training run consists of exactly 1 million environment steps. |
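
The figures quoted in the Experiment Setup row imply a fixed update budget, which the following minimal Python sketch works out. The four constants come from the table; the variable names are illustrative, and the use of integer division for the rollout count is an assumption, since 1 million is not a multiple of 2,048 and the table does not say how the final partial rollout is handled.

```python
# Values quoted in the Experiment Setup row of the table.
ROLLOUT_STEPS = 2_048      # environment steps collected per rollout phase
NUM_MINIBATCHES = 32       # mini-batches per rollout
UPDATE_EPOCHS = 10         # optimization epochs per rollout
TOTAL_STEPS = 1_000_000    # environment steps per training run

# Samples in each mini-batch.
minibatch_size = ROLLOUT_STEPS // NUM_MINIBATCHES

# Assumption: only complete rollouts are counted (integer division).
num_rollouts = TOTAL_STEPS // ROLLOUT_STEPS

# Each epoch makes one pass over all mini-batches of the rollout.
updates_per_rollout = UPDATE_EPOCHS * NUM_MINIBATCHES
total_updates = num_rollouts * updates_per_rollout

print(f"mini-batch size:       {minibatch_size}")
print(f"complete rollouts:     {num_rollouts}")
print(f"updates per rollout:   {updates_per_rollout}")
print(f"total gradient updates: {total_updates}")
```

Under these assumptions each mini-batch holds 64 samples, and a run performs 488 complete rollouts of 320 gradient updates each.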