Global Convergence Rate of Deep Equilibrium Models with General Activations

Authors: Lan V. Truong

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This compelling result is further supported by our numerical experiments on the MNIST and CIFAR-10 datasets.
Researcher Affiliation | Academia | Lan V. Truong, EMAIL, School of Mathematics, Statistics and Actuarial Science, University of Essex
Pseudocode | Yes | A weight initialisation algorithm (WIALG) is as follows. Initialise: $m = n$, $\sigma_w^2 = \frac{1}{96L^2}$. Generate a matrix $W \in \mathbb{R}^{m \times m}$ where $W_{ij} \sim \mathcal{N}\left(0, \frac{2\sigma_w^2}{m}\right)$. Generate a matrix $U \in \mathbb{R}^{m \times d}$ where $U_{ij} \sim \mathcal{N}\left(0, \frac{2}{m}\right)$. Generate a vector $a \in \mathbb{R}^m$ where $a_i \sim \mathcal{N}\left(0, \frac{1}{m}\right)$. Find a fixed point $T$ of the equation $T = \varphi(WT + UX)$ by using the Anderson acceleration method of Walker & Ni (2011).
Open Source Code | No | The paper does not provide any explicit statement about open-sourcing the code or a link to a code repository.
Open Datasets | Yes | In this section, we conduct experiments to validate Theorem 3. Specifically, we evaluate the performance of the DEQ model on the MNIST and CIFAR-10 datasets.
Dataset Splits | No | The paper mentions using MNIST and CIFAR-10 datasets and normalizing data points, but does not specify training, validation, or test splits, or any methodology for creating them.
Hardware Specification | No | The paper does not specify any hardware (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers used for the experiments.
Experiment Setup | No | While the paper describes varying parameters such as 'm' and activation functions, it lacks specific details regarding hyperparameters for the numerical experiments, such as the exact learning rate used, batch size, optimizer, or number of epochs.
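
The weight initialisation algorithm (WIALG) quoted in the Pseudocode row can be illustrated with a minimal Python sketch. It is not the author's code, and several choices are assumptions: the names `matmul`, `gaussian_matrix`, and `wialg` are hypothetical; `X` is taken to be a d × n matrix whose columns are data points; "m = n" is read as setting the width to the number of data points; the activation is `tanh` with Lipschitz constant L = 1; and plain fixed-point (Picard) iteration is used in place of Anderson acceleration to keep the example dependency-free.

```python
import math
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def gaussian_matrix(rows, cols, variance, rng):
    """Matrix with i.i.d. N(0, variance) entries."""
    std = math.sqrt(variance)
    return [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]

def wialg(X, m=None, L=1.0, phi=math.tanh, tol=1e-10, max_iter=500, seed=0):
    """Sketch of WIALG; X is d x n (assumed layout: columns are data points)."""
    rng = random.Random(seed)
    d, n = len(X), len(X[0])
    if m is None:
        m = n  # "m = n" from the quoted initialisation (assumed reading)
    sigma_w2 = 1.0 / (96.0 * L ** 2)
    W = gaussian_matrix(m, m, 2.0 * sigma_w2 / m, rng)  # W_ij ~ N(0, 2*sigma_w^2/m)
    U = gaussian_matrix(m, d, 2.0 / m, rng)             # U_ij ~ N(0, 2/m)
    a = [rng.gauss(0.0, math.sqrt(1.0 / m)) for _ in range(m)]  # a_i ~ N(0, 1/m)
    UX = matmul(U, X)
    T = [[0.0] * n for _ in range(m)]
    residual = float("inf")
    # Picard iteration T <- phi(W T + U X); a stand-in for Anderson acceleration.
    for _ in range(max_iter):
        WT = matmul(W, T)
        T_next = [[phi(WT[i][j] + UX[i][j]) for j in range(n)] for i in range(m)]
        residual = max(abs(T_next[i][j] - T[i][j])
                       for i in range(m) for j in range(n))
        T = T_next
        if residual < tol:
            break
    return W, U, a, T, residual

# Usage on a tiny synthetic input (3 features, 4 data points).
X_demo = [[0.1, -0.2, 0.3, 0.0],
          [0.5, 0.4, -0.1, 0.2],
          [-0.3, 0.0, 0.2, 0.1]]
W, U, a, T, residual = wialg(X_demo)
```

With the small variance chosen for W, the map T ↦ φ(WT + UX) is typically a contraction for a 1-Lipschitz activation, which is why the simple iteration is expected to converge here; Anderson acceleration would serve the same purpose with fewer iterations.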