Learning Interpretable Hierarchical Dynamical Systems Models from Time Series Data
Authors: Manuel Brenner, Elias Weber, Georgia Koppe, Daniel Durstewitz
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compared the performance of our framework to three other recent methods (see Appx. A.4 for details): First, as a baseline we tested an ensemble of individual shPLRNNs, using an otherwise identical training algorithm (a kind of ablation experiment, removing specifically the hierarchical component). Second, we employed LEarning Across Dynamical Systems (LEADS, Yin et al. (2021)), a framework that trains Neural ODEs for generalizing across DS environments by learning a shared dynamics model jointly with environment-specific models. Third, we trained context-informed dynamics adaptation (CoDA, Kirchmeyer et al. (2022)), an extension of LEADS where parameters of the combined and environment-specific models are drawn from a hypernetwork. As evidenced in Table 1, our hierarchical approach (hier-shPLRNN) considerably outperforms all other setups. In fact, competing methods were often not even able to correctly reproduce the long-term attractor dynamics (Appx. Fig. 17), while our approach successfully recovered different attractor topologies (Figs. 2 & 21). |
| Researcher Affiliation | Academia | (1) Dept. of Theoretical Neuroscience, Central Institute of Mental Health (CIMH); (2) Interdisciplinary Center for Scientific Computing, Heidelberg University; (3) Hector Institute for AI in Psychiatry and Dept. of Psychiatry and Psychotherapy, CIMH |
| Pseudocode | No | The paper describes the training procedure and model equations (Eq. 1-24) and provides an illustration of the training algorithm in Figure 8, but it does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code for this paper is publicly available at https://github.com/DurstewitzLab/HierarchicalDSR. |
| Open Datasets | Yes | The pulse wave dataset (Charlton et al., 2019) was taken from https://peterhcharlton.github.io/pwdb/. ... The EEG recordings from epileptic patients and healthy controls were originally provided in Andrzejak et al. (2001) and reformatted by Zhang et al. (2022). The dataset is publicly available on the Time Series Classification Website (https://www.timeseriesclassification.com/description.php?Dataset=Epilepsy2) under Epilepsy2. ... The fMRI dataset analyzed in this study was initially collected by Koppe et al. (2014), re-analyzed in Koppe et al. (2019), and made publicly accessible as part of the latter work. ... The dataset from Trindade (2015) consists of electricity load time series from 370 sites recorded between 2011 and 2014. ... UCI Machine Learning Repository, 2015. URL https://archive.ics.uci.edu/dataset/321/electricityloaddiagrams20112014. |
| Dataset Splits | Yes | For the values in Table 1, we sampled single time series with 10,000 time steps from both the ground truth system (serving as the test set) and from each trained model, inferring only the initial state from the first time step of the test set, and cutting off transients (1,000 time steps) to focus on the long-term attractor behavior. ... From these sequences, 80 (40 per class) were selected as the training set on the Time Series Classification Website, which we used for our experiments. ... We then generated new, short sequences x^test_{1...Tmax} with Tmax = 100, using values of ρ^(j)_test randomly sampled from the same interval [28, 80] that also contained the training data, and fine-tuned only a scalar new feature l^(j)_test on this test set. |
| Hardware Specification | Yes | This allowed gradient descent to converge rapidly, within 6 seconds on a single 11th Gen 2.30 GHz Intel Core i7-11800H. ... All models were trained on a single core of an Intel Xeon Gold 6254. |
| Software Dependencies | No | The paper mentions several software components like RAdam optimizer, scipy.integrate, lsoda solver, tsfresh, and tslearn packages, but does not provide specific version numbers for these software dependencies, nor for the main programming language or deep learning framework used. |
| Experiment Setup | Yes | To ensure successful training, we found it crucial to assign a significantly higher learning rate to the subject-specific feature vectors (10^-3) than to the group-level matrices (10^-4). ... We used Xavier uniform initialization (Glorot and Bengio, 2010) for the group-level matrices... We reduced the weights further by a factor of 0.1 to stabilize training. Additionally, we applied L2 regularization to the group-level matrices... Table 5: Hyperparameters for our models from Sect. 4.1 (Nfeat / M / L / αstart / αend): Lorenz-63: shPLRNN ensemble (n/a / 3 / 30 / 0.2 / 0.02), hier-shPLRNN (6 / 3 / 150 / 0.2 / 0.02); Lorenz-96: shPLRNN ensemble (n/a / 10 / 100 / 0.2 / 0.02), hier-shPLRNN (5 / 10 / 200 / 0.2 / 0.02). |
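The two-tier learning-rate scheme quoted in the Experiment Setup row (10^-3 for subject-specific feature vectors vs. 10^-4 for shared group-level matrices) can be sketched on a toy problem. This is not the authors' implementation (that lives in the linked repository); the model here is a hypothetical least-squares objective, the variable names are invented, and plain NumPy gradient descent stands in for the RAdam-based training the paper actually uses.

```python
# Toy sketch of hierarchical training with two learning rates:
# per-subject feature vectors adapt fast (1e-3), the shared
# group-level matrix updates slowly (1e-4). All names hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_feat, dim = 4, 3, 5

W_group = rng.normal(scale=0.01, size=(dim, n_feat))  # shared across subjects
feats = rng.normal(size=(n_subjects, n_feat))         # one vector per subject
targets = rng.normal(size=(n_subjects, dim))          # toy regression targets

LR_FEAT, LR_GROUP = 1e-3, 1e-4

def mse():
    return float(np.mean([(W_group @ feats[j] - targets[j]) ** 2
                          for j in range(n_subjects)]))

loss_init = mse()
for step in range(2000):
    for j in range(n_subjects):
        err = W_group @ feats[j] - targets[j]   # residual for subject j
        g_feat = W_group.T @ err                # grad of 0.5*||err||^2 w.r.t. feats[j]
        g_group = np.outer(err, feats[j])       # grad w.r.t. W_group
        feats[j] -= LR_FEAT * g_feat            # fast subject-level step
        W_group -= LR_GROUP * g_group           # slow group-level step

loss = mse()
```

The asymmetry is plausible for the reason the quoted passage hints at: the low-dimensional subject features must move quickly to explain individual variability, while the group-level matrices aggregate gradients from all subjects and stay stable with smaller steps. This sketch only illustrates that mechanic, not the shPLRNN architecture itself.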