Avoiding Undesired Future with Sequential Decisions
Authors: Lue Tao, Tian-Zuo Wang, Yuan Jiang, Zhi-Hua Zhou
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, experimental results confirm the practical effectiveness of the proposed approach in both simulated and real-world tasks. |
| Researcher Affiliation | Academia | Lue Tao, Tian-Zuo Wang, Yuan Jiang and Zhi-Hua Zhou; National Key Laboratory for Novel Software Technology, Nanjing University, China; School of Artificial Intelligence, Nanjing University, China |
| Pseudocode | Yes | Algorithm 1: Multi-Stage Rehearsal. Input: number of stages M. Output: sequence of alterations A. 1. Initialize the sequence of alterations A = [ ]. 2. for m ← 1 to M do: 3. Acquire rehearsal model M⟨G, f, p(ϵ)⟩. 4. Make a new observation o on O_m. 5. Obtain the updated noise p_o(ϵ) by incorporating o into p(ϵ) through retrospective inference. 6. Update rehearsal model M⟨G, f, p_o(ϵ)⟩. 7. Select an alteration Rh(A = a) from A_m by minimizing the probability of failure. 8. Obtain the altered graph G_A from G by removing the incoming arrows of A in G. 9. Obtain the altered equations f_a from f by setting the equation of A to A = a. 10. Update rehearsal model M⟨G_A, f_a, p_o(ϵ)⟩. 11. Append the selected alteration Rh(A = a) to A. |
| Open Source Code | No | The paper does not contain any explicit statement about providing source code or a link to a code repository. |
| Open Datasets | Yes | For the Bermuda data [Aglietti et al., 2020], which includes eleven variables, the goal is to maintain the net coral ecosystem calcification (NEC) within the desired range of [0.5, 2]. |
| Dataset Splits | No | The paper describes a learning process over 'seasons' and a simulated task, but does not provide specific train/test/validation dataset splits for any explicitly used dataset. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions that the SRM is learned through Bayesian ridge regression and compares with DDPG, PPO, and SAC, but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | No | The paper mentions desired ranges for outcome variables, that an SRM is learned through Bayesian ridge regression over 100 seasons, and that experiments are repeated 100 times. It also states 'More detailed experimental settings are provided in the appendix.', implying that specific hyperparameters or full system-level settings are not present in the main text. |
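The control flow of Algorithm 1 (Multi-Stage Rehearsal) quoted above can be sketched as a plain Python loop. This is a hypothetical skeleton only: every helper (`observe`, `retrospective_inference`, `select_alteration`, `alter_graph`, `alter_equations`) is a placeholder standing in for the paper's components, not an implementation of them.

```python
# Hypothetical skeleton of Algorithm 1 (Multi-Stage Rehearsal).
# All helper callables are placeholders for the paper's components.

def multi_stage_rehearsal(M, graph, equations, noise_prior,
                          observe, retrospective_inference,
                          select_alteration, alter_graph, alter_equations):
    """Return the sequence of alterations A chosen over M stages."""
    alterations = []                                  # step 1: A = []
    G, f, p_eps = graph, equations, noise_prior       # rehearsal model M<G, f, p(eps)>
    for m in range(1, M + 1):                         # step 2: for m <- 1 to M
        o = observe(m)                                # step 4: observation o on O_m
        p_eps = retrospective_inference(p_eps, o)     # step 5: updated noise p_o(eps)
        a_var, a_val = select_alteration(G, f, p_eps, m)  # step 7: minimize P(failure)
        G = alter_graph(G, a_var)                     # step 8: cut incoming arrows of A
        f = alter_equations(f, a_var, a_val)          # step 9: set equation A = a
        alterations.append((a_var, a_val))            # step 11: append Rh(A = a)
    return alterations
```

With trivial stand-ins for the helpers, the loop simply accumulates one alteration per stage, which matches the algorithm's stated output.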
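The Software Dependencies row notes that the SRM is learned through Bayesian ridge regression without naming an implementation. As a minimal sketch of that step, the closed-form weight posterior below assumes a Gaussian prior N(0, α⁻¹I) and known noise precision β; the hyperparameters and synthetic data are illustrative, not the paper's settings.

```python
# Minimal sketch of Bayesian ridge regression for one structural equation.
# Prior over weights: N(0, alpha^-1 I); Gaussian noise precision: beta.
# Hyperparameter values and data here are illustrative assumptions.
import numpy as np

def bayesian_ridge_posterior(X, y, alpha=1.0, beta=100.0):
    """Closed-form posterior N(m, S) over linear weights for y = Xw + noise."""
    S_inv = alpha * np.eye(X.shape[1]) + beta * X.T @ X
    S = np.linalg.inv(S_inv)
    m = beta * S @ X.T @ y
    return m, S

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                  # parent-variable samples
w_true = np.array([1.5, -0.5])
y = X @ w_true + 0.1 * rng.normal(size=200)    # noisy linear structural equation
m, S = bayesian_ridge_posterior(X, y)          # posterior mean and covariance
```

The posterior mean recovers the generating weights, and the covariance diagonal gives per-coefficient uncertainty, which is the quantity a rehearsal model would propagate.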