Neural Control and Certificate Repair via Runtime Monitoring

Authors: Emily Yu, Đorđe Žikelić, Thomas A. Henzinger

AAAI 2025 | Venue PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We demonstrate the effectiveness of our approach empirically by using it to repair and to boost the safety rate of neural network policies learned by a state-of-the-art method for learning-based control on two autonomous system control tasks.
Researcher Affiliation Academia Emily Yu1, Đorđe Žikelić2, Thomas A. Henzinger1 1Institute of Science and Technology Austria, Klosterneuburg, Austria 2Singapore Management University, Singapore, Singapore EMAIL, EMAIL, EMAIL
Pseudocode Yes Algorithm 1: Policy Repair. Monitoring-based neural network policy and certificate repair for safety properties.
Open Source Code No We implemented a prototype of our method in Python 3.6 as an extension of the SABLAS codebase (Qin, Sun, and Fan 2022). The paper mentions extending a codebase, but does not state that their specific code for this work is open-source or provide a link.
Open Datasets Yes We consider the benchmarks from the SABLAS codebase, together with policies and certificate functions learned by the SABLAS method for these benchmarks. We then apply our method to these control policies and certificate functions in order to experimentally evaluate the ability of our method to detect incorrect behaviors and to repair them. The first benchmark concerns a parcel delivery drone flying in a city (called the active drone)... The second benchmark Ship Env concerns a ship moving in a river among 32 other ships.
Dataset Splits No For each monitored execution, N = 2000 and N = 1200 states are observed for Ship Env and Drone Env, respectively, spaced out at time intervals of t = 0.1s. In each environment, the states of the eight nearest obstacles (i.e. drones or ships) are given as observations to the policy, as well as the certificate functions. The paper describes parameters for data observation during monitoring (N, t) and counts of initial states or executions (D), but not explicit training/validation/test splits of a static dataset.
Hardware Specification No We implemented a prototype of our method in Python 3.6 as an extension of the SABLAS codebase (Qin, Sun, and Fan 2022). The paper does not specify any hardware used for running experiments, such as CPU, GPU models, or cloud resources.
Software Dependencies Yes We implemented a prototype of our method in Python 3.6 as an extension of the SABLAS codebase (Qin, Sun, and Fan 2022).
Experiment Setup Yes For each monitored execution, N = 2000 and N = 1200 states are observed for Ship Env and Drone Env, respectively, spaced out at time intervals of t = 0.1s. In addition, if the Pred PM monitor is to be used, the algorithm also takes as input the three thresholds ξU, ξS, and ξN. ... We conducted an analogous experiment with a certificate function consisting both of a barrier and a Lyapunov function and again observed significant improvements upon repair. Further details, including results on additional threshold configurations of Pred PM beyond the three configurations discussed in Table 1, can be found in the extended version (Yu, Zikelic, and Henzinger 2024).