Learning dynamics in linear recurrent neural networks

Authors: Alexandra Maria Proca, Clémentine Carla Juliette Dominé, Murray Shanahan, Pedro A. M. Mediano

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | As a final proof-of-concept, we apply our theoretical framework to explain the behavior of LRNNs performing sensory integration tasks. Our work provides a first analytical treatment of the relationship between the temporal dependencies in tasks and learning dynamics in LRNNs, building a foundation for understanding how complex dynamic behavior emerges in cognitive models. We demonstrate the generalizability of our results by applying our theoretical framework to describe the behavior of LRNNs trained on sensory integration tasks, relaxing our prior assumptions. Simulations are detailed in Appendix P, providing empirical validation of the theoretical claims.
Researcher Affiliation | Academia | 1Department of Computing, Imperial College London, London, United Kingdom 2Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom 3Division of Psychology and Language Sciences, University College London, London, United Kingdom. Correspondence to: Alexandra M. Proca <EMAIL>.
Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. Mathematical derivations and model descriptions are presented in standard text and equation format.
Open Source Code | Yes | Code for all simulations can be found at https://github.com/aproca/LRNN_dynamics
Open Datasets | Yes | To create the sensory-integration tasks, we use the multisensory integration task from Neurogym (Molano-Mazón et al., 2022) to generate stimuli in four input dimensions (removing the fixation input).
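To make the stimulus setup concrete, the following is a minimal hand-rolled sketch of a multisensory-integration trial of the kind described above (four input dimensions, two per modality, fixation input removed). It is a stand-in for illustration only, not the Neurogym API, and the trial length, coherence, and noise level are illustrative assumptions.

```python
import numpy as np

def make_multisensory_trial(T=50, coherence=0.5, noise=0.1, rng=None):
    """Hand-rolled stand-in for a Neurogym-style multisensory integration
    trial (fixation input removed): four input channels, two per modality,
    each carrying noisy evidence for one of two choices."""
    rng = np.random.default_rng() if rng is None else rng
    choice = rng.integers(2)  # ground-truth direction (0 or 1)
    # Mean drive for the two evidence channels of one modality: the channel
    # matching the true choice is stronger in proportion to the coherence.
    means = np.where(np.arange(2) == choice,
                     (1 + coherence) / 2,
                     (1 - coherence) / 2)
    x = np.tile(means, 2)                        # 4 dims: 2 modalities x 2 channels
    X = x + noise * rng.standard_normal((T, 4))  # noisy evidence over T steps
    y = np.zeros((T, 2))
    y[:, choice] = 1.0                           # target: report the chosen direction
    return X, y, choice

X, y, choice = make_multisensory_trial()
```

Averaging the two modalities' channels over time recovers the underlying evidence, which is the integration the task demands of the network.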
Dataset Splits | No | The paper describes how stimuli for sensory integration tasks were generated but does not specify how this data was divided into training, validation, or test sets for experimental reproduction.
Hardware Specification | No | No specific hardware details such as GPU models, CPU models, or other computing resources used for running the experiments are mentioned in the paper.
Software Dependencies | No | No specific software dependencies, libraries, or programming language versions (e.g., Python 3.8, PyTorch 1.9) are mentioned in the paper.
Experiment Setup | No | Networks are trained using gradient descent on the mean squared error and automatic differentiation in order to validate our theoretical results. We modify our learning timescale to account for the additional scalars introduced by taking the mean over the P samples and the output dimension Ny (τ = P·Ny/η, where η is the learning rate) when comparing to simulation. The paper does not provide concrete numerical values for hyperparameters such as the learning rate (η or τ), batch size, or number of epochs.
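The training procedure quoted above can be sketched as follows: a linear RNN (h_t = W h_{t-1} + U x_t, y_t = V h_t) trained by full-batch gradient descent on the MSE via automatic differentiation. Since the MSE averages over both the P samples and the Ny output dimensions, a discrete learning rate η corresponds to the continuous-time learning timescale τ = P·Ny/η. This is a minimal PyTorch sketch; the hidden size, initialization scale, learning rate, and epoch count are illustrative assumptions, since the paper does not report them.

```python
import torch

def train_linear_rnn(X, Y, hidden=16, eta=1e-2, epochs=100):
    """Full-batch gradient descent on the MSE of a linear RNN via autograd.
    X: (P, T, Nx) inputs; Y: (P, T, Ny) targets. Returns the loss history.
    Shapes, init scale 0.1, and eta are illustrative, not the paper's values."""
    P, T, Nx = X.shape
    Ny = Y.shape[-1]
    scale = 0.1
    U = (scale * torch.randn(Nx, hidden)).requires_grad_()   # input weights
    W = (scale * torch.randn(hidden, hidden)).requires_grad_()  # recurrent weights
    V = (scale * torch.randn(hidden, Ny)).requires_grad_()   # readout weights
    params = [U, W, V]
    losses = []
    for _ in range(epochs):
        h = torch.zeros(P, hidden)
        loss = 0.0
        for t in range(T):
            h = h @ W + X[:, t] @ U              # linear recurrence (no nonlinearity)
            loss = loss + ((h @ V - Y[:, t]) ** 2).mean()  # mean over P and Ny
        loss = loss / T
        grads = torch.autograd.grad(loss, params)
        with torch.no_grad():
            for p, g in zip(params, grads):
                p -= eta * g                     # plain gradient descent step
        losses.append(loss.item())
    return losses
```

Because the `.mean()` averages over both P and Ny, dividing η by those factors (equivalently, taking τ = P·Ny/η) is what lines up the simulated step size with the per-sample, per-output gradient flow used in the theory.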