HopCast: Calibration of Autoregressive Dynamics Models
Authors: Muhammad Bilal Shahid, Cody Fleming
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our contributions are: ... lower calibration and prediction error across several benchmarks, without the use of complex uncertainty propagation techniques. ... The calibration and prediction performances are evaluated across a set of dynamical systems. This work is also the first to benchmark existing uncertainty propagation methods based on calibration errors. We also evaluate HopCast as a substitute for Deep Ensembles within a model-based reinforcement learning planner, demonstrating improved performance across multiple control tasks. |
| Researcher Affiliation | Academia | Muhammad Bilal Shahid EMAIL Department of Mechanical Engineering Iowa State University Cody Fleming EMAIL Department of Mechanical Engineering Iowa State University |
| Pseudocode | Yes | Algorithm 1 SL Tuning for Calibration |
| Open Source Code | No | The text provides a link to code for baselines from a previous work (Chua et al., 2018b): "Code available at: https://github.com/kchua/handful-of-trials". However, it does not explicitly state that the code for the methodology described in *this* paper (HopCast) is available at this link or any other provided URL. |
| Open Datasets | Yes | Section 5.2 Datasets: We have discussed one dynamical system, i.e., LV, in section 3. Other dynamical systems include Lorenz, FitzHugh-Nagumo (FHN), Lorenz95, and the Glycolytic Oscillator. To generate datasets, we randomly sample N initial conditions from within a specified range for the state variables of each system. The solve_ivp method from scipy is used to integrate the dynamics with the adaptive-step RK45 solver. To produce uniformly sampled trajectories, system states are extracted at fixed time intervals t as mentioned in Table 4. For Lorenz and LV, the dynamics of both system states and their derivatives are modeled, whereas the dynamics of states are modeled for the rest of the systems. The mathematical forms of each dynamical system, ranges of initial conditions, and parameter values are given in Appendix B. |
| Dataset Splits | Yes | Once we have SQ, SK, SVx, SVy, they are split into train/test with an 80/20 split. |
| Hardware Specification | Yes | All experiments were run on NVIDIA A100-SXM4-80GB. |
| Software Dependencies | No | The paper mentions software used, such as "PyTorch", "Adam", and "AdamW", but does not provide specific version numbers for these software components. For example, it states "PyTorch is used to implement baselines and HopCast" but lacks version details like "PyTorch 1.9". |
| Experiment Setup | Yes | F.1 Baselines Implementation: The batch size, learning rate, optimizer, and epochs were kept the same for all experiments, and are 128, 0.001, Adam (Kingma & Ba, 2017), and 1000, respectively. ... two layers of a fully connected feedforward model with 400 neurons each were used for LV and FHN, and three layers with 400 neurons for Lorenz, Lorenz95, and Glycolytic Oscillator. F.2 HopCast Implementation: The Encoder was a fully connected feedforward model with one layer and 100 neurons. ... The deterministic Predictor was a fully connected feedforward model of two layers with 400 neurons each for LV and FHN, and three layers with 400 neurons for Lorenz, Lorenz95, and Glycolytic Oscillator. ... The learning rate of 0.001 and the optimizer AdamW (Loshchilov & Hutter, 2019) were kept the same for all experiments. The batch size and epochs were different for each experiment, and are provided in the form of yml files as a supplementary material along with other hyperparameters for each system and noise scaling factor (σ). Table 3 has SL for each output of all systems at various noise scaling factors σ. |
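The dataset-generation procedure quoted under Open Datasets (sample N initial conditions, integrate with scipy's adaptive-step RK45, then extract states at a fixed interval) can be sketched as below for the LV (Lotka-Volterra) system. The parameter values, initial-condition ranges, sampling interval, and trajectory length here are illustrative placeholders, since the paper defers those specifics to Appendix B and Table 4:

```python
import numpy as np
from scipy.integrate import solve_ivp

def lotka_volterra(t, z, alpha=1.0, beta=0.1, delta=0.075, gamma=1.5):
    """LV predator-prey dynamics; parameter values are illustrative, not from the paper."""
    x, y = z
    return [alpha * x - beta * x * y, delta * x * y - gamma * y]

rng = np.random.default_rng(0)
N = 5                                     # number of sampled initial conditions (illustrative)
t_final = 10.0
t_eval = np.linspace(0.0, t_final, 101)   # fixed-interval sampling grid (uniform trajectories)

trajectories = []
for _ in range(N):
    # Initial conditions drawn uniformly from an assumed range per state variable.
    z0 = rng.uniform(low=[5.0, 2.0], high=[10.0, 5.0])
    sol = solve_ivp(lotka_volterra, (0.0, t_final), z0,
                    method="RK45", t_eval=t_eval)   # adaptive-step RK45, as in the paper
    trajectories.append(sol.y.T)                    # (timesteps, state_dim)

data = np.stack(trajectories)  # (N, timesteps, state_dim)
```

Note that `t_eval` only controls where the solution is reported; RK45 still chooses its internal step sizes adaptively, which matches the paper's description of extracting states at fixed intervals.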
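The baseline configuration quoted from F.1 (fully connected feedforward model, two hidden layers of 400 neurons for LV/FHN or three for the other systems; Adam with learning rate 0.001, batch size 128, 1000 epochs) can be sketched in PyTorch as follows. The activation function and input/output dimensions are not stated in the quoted excerpt, so ReLU and the dimensions below are assumptions:

```python
import torch
import torch.nn as nn

def make_baseline(in_dim, out_dim, n_hidden=2, width=400):
    """Feedforward baseline per F.1: n_hidden=2 for LV/FHN, 3 for Lorenz/Lorenz95/Glycolytic."""
    layers, d = [], in_dim
    for _ in range(n_hidden):
        layers += [nn.Linear(d, width), nn.ReLU()]  # ReLU assumed; activation not quoted
        d = width
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)

model = make_baseline(in_dim=4, out_dim=4)  # dims illustrative (LV states + derivatives)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # shared settings from F.1
batch_size, epochs = 128, 1000
```

For the HopCast components in F.2, the same helper could be reused with the quoted sizes (a one-layer, 100-neuron Encoder; a two- or three-layer, 400-neuron deterministic Predictor), swapping Adam for AdamW.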