Only Strict Saddles in the Energy Landscape of Predictive Coding Networks?
Authors: Francesco Innocenti, El Mehdi Achour, Ryan Singh, Christopher L Buckley
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on both linear and non-linear networks strongly validate our theory and further suggest that all the saddles of the equilibrated energy are strict. |
| Researcher Affiliation | Academia | Francesco Innocenti (School of Engineering and Informatics, University of Sussex) EMAIL; El Mehdi Achour (RWTH Aachen University, Aachen, Germany) EMAIL; Ryan Singh (School of Engineering and Informatics, University of Sussex) EMAIL; Christopher L. Buckley (School of Engineering and Informatics, University of Sussex; VERSES) EMAIL |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code to reproduce all the experiments is available at https://github.com/francesco-innocenti/pc-saddles. |
| Open Datasets | Yes | We trained DLNs with different numbers of hidden layers H ∈ {2, 5, 10} on standard image classification datasets (MNIST, Fashion-MNIST and CIFAR-10). |
| Dataset Splits | No | The paper mentions training networks and observing training loss dynamics but does not explicitly provide information on train/validation/test splits, proportions, or specific methods for data partitioning. |
| Hardware Specification | No | The paper's NeurIPS checklist states: "Most experimental results can be reproduced in a few hours on a CPU, with the exception of those related to Figures 5 & 12 which were run on a GPU (typically A100)." This is not a specific hardware specification for all experiments. |
| Software Dependencies | No | The paper mentions using "standard Euler integration" and a "second-order explicit Runge-Kutta ODE solver (Heun)" but does not list specific software libraries or frameworks with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x). |
| Experiment Setup | Yes | The following hyperparameters were used for all networks: 300 hidden units and SGD with learning rate η = 1e-3 and batch size b = 64. We used a second-order explicit Runge-Kutta ODE solver (Heun) with a maximum upper integration limit T = 300 and an adaptive Proportional-Integral-Derivative controller (absolute and relative tolerances: 1e-3) to ensure convergence of the PC inference dynamics (Eq. 3). All networks were initialised close to the origin, W_ij ~ N(0, σ²) with σ = 5e-3. |
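The experiment setup above (Heun integration of the inference dynamics up to T = 300 with tolerance 1e-3, and near-origin weight initialisation with σ = 5e-3) can be sketched in plain NumPy. This is not the authors' implementation: their solver uses an adaptive PID step-size controller, whereas this sketch uses a fixed step size and a crude update-magnitude convergence check, and the function names (`heun_step`, `integrate`) are illustrative.

```python
import numpy as np

def heun_step(f, t, y, dt):
    """One step of Heun's method (explicit second-order Runge-Kutta)."""
    k1 = f(t, y)
    k2 = f(t + dt, y + dt * k1)
    return y + 0.5 * dt * (k1 + k2)

def integrate(f, y0, t0=0.0, t1=300.0, dt=0.1, tol=1e-3):
    """Integrate dy/dt = f(t, y) until t1, or stop early once the
    per-step update falls below tol (stand-in for the adaptive
    PID-controlled convergence criterion described in the paper)."""
    t, y = t0, np.asarray(y0, dtype=float)
    while t < t1:
        y_next = heun_step(f, t, y, dt)
        if np.max(np.abs(y_next - y)) < tol:
            return y_next
        t, y = t + dt, y_next
    return y

# Near-origin weight initialisation, W_ij ~ N(0, sigma^2) with sigma = 5e-3,
# for a 300-unit hidden layer as stated in the setup.
rng = np.random.default_rng(0)
sigma = 5e-3
W = rng.normal(0.0, sigma, size=(300, 300))

# Toy example: linear decay dy/dt = -y relaxes toward the origin.
y_final = integrate(lambda t, y: -y, np.ones(4))
```

The early-stopping check mimics running inference "to convergence" rather than always integrating to the upper limit T.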