Learning dynamics in linear recurrent neural networks
Authors: Alexandra Maria Proca, Clémentine Carla Juliette Dominé, Murray Shanahan, Pedro A. M. Mediano
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | As a final proof-of-concept, we apply our theoretical framework to explain the behavior of LRNNs performing sensory integration tasks. Our work provides a first analytical treatment of the relationship between the temporal dependencies in tasks and learning dynamics in LRNNs, building a foundation for understanding how complex dynamic behavior emerges in cognitive models. We demonstrate the generalizability of our results by applying our theoretical framework to describe the behavior of LRNNs trained on sensory integration tasks, relaxing our prior assumptions. Simulations are detailed in Appendix P, providing empirical validation of the theoretical claims. |
| Researcher Affiliation | Academia | 1Department of Computing, Imperial College London, London, United Kingdom 2Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom 3Division of Psychology and Language Sciences, University College London, London, United Kingdom. Correspondence to: Alexandra M. Proca <EMAIL>. |
| Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. Mathematical derivations and model descriptions are presented in standard text and equation format. |
| Open Source Code | Yes | Code for all simulations can be found at https://github.com/aproca/LRNN_dynamics |
| Open Datasets | Yes | To create the sensory-integration tasks, we use the multisensory integration task from Neurogym (Molano-Mazón et al., 2022) to generate stimuli in four input dimensions (removing the fixation input). |
| Dataset Splits | No | The paper describes how stimuli for sensory integration tasks were generated but does not specify how this data was divided into training, validation, or test sets for experimental reproduction. |
| Hardware Specification | No | No specific hardware details such as GPU models, CPU models, or other computing resources used for running the experiments are mentioned in the paper. |
| Software Dependencies | No | No specific software dependencies, libraries, or programming language versions (e.g., Python 3.8, PyTorch 1.9) are mentioned in the paper. |
| Experiment Setup | No | Networks are trained using gradient descent on the mean squared error and automatic differentiation in order to validate our theoretical results. We modify our learning timescale to account for the additional scalars introduced by taking the mean over P samples and the output dimension Ny (τ = PNy/η, where η is the learning rate) when comparing to simulation. The paper does not provide concrete numerical values for hyperparameters such as the learning rate (η or τ), batch size, or number of epochs. |
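To illustrate the training procedure described in the Experiment Setup row (gradient descent on the mean squared error for a linear RNN), the following is a minimal sketch — not the authors' code. It trains a scalar linear RNN on a toy integration task, with the gradients computed by manual backpropagation through time in place of automatic differentiation; all hyperparameter values here are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Hypothetical toy setup: scalar linear RNN h_t = a*h_{t-1} + b*x_t, readout
# y = c*h_T, trained by gradient descent on the MSE. Here Ny = 1, so the
# effective timescale from the paper would be tau = P*Ny/eta.
rng = np.random.default_rng(0)
T, P = 5, 32                       # sequence length, number of samples
X = rng.normal(size=(P, T))        # input sequences
Y = X.sum(axis=1)                  # target: integrate the inputs over time

a, b, c = 0.1, 0.1, 0.1            # recurrent, input, and readout weights
eta = 0.01                         # learning rate (illustrative value)

def forward(a, b, c, x):
    """Run the linear RNN over all samples; return outputs and hidden states."""
    h = np.zeros(P)
    hs = []
    for t in range(T):
        h = a * h + b * x[:, t]
        hs.append(h.copy())
    return c * h, hs

def loss(a, b, c):
    y_hat, _ = forward(a, b, c, X)
    return np.mean((y_hat - Y) ** 2)

initial = loss(a, b, c)
for _ in range(500):
    y_hat, hs = forward(a, b, c, X)
    err = 2.0 * (y_hat - Y) / P    # dL/dy_hat for the MSE
    grad_c = np.sum(err * hs[-1])
    delta = err * c                # gradient flowing into h_T
    grad_a = grad_b = 0.0
    for t in range(T - 1, -1, -1): # backpropagate through time
        h_prev = hs[t - 1] if t > 0 else np.zeros(P)
        grad_a += np.sum(delta * h_prev)
        grad_b += np.sum(delta * X[:, t])
        delta = delta * a          # propagate gradient to h_{t-1}
    a -= eta * grad_a
    b -= eta * grad_b
    c -= eta * grad_c

final = loss(a, b, c)
```

The optimum for this task is a = 1 with b*c = 1 (a perfect integrator), so the MSE should decrease steadily from its initial value as the weights move toward that solution.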