Global Convergence Rate of Deep Equilibrium Models with General Activations
Authors: Lan V. Truong
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This compelling result is further supported by our numerical experiments on the MNIST and CIFAR-10 datasets. |
| Researcher Affiliation | Academia | Lan V. Truong EMAIL School of Mathematics, Statistics and Actuarial Science University of Essex |
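| Pseudocode | Yes | A weight initialisation algorithm (WIALG) is as follows. Initialise: $m = n$, $\sigma_w^2 = \frac{1}{96L^2}$. Generate a matrix $W \in \mathbb{R}^{m \times m}$ where $W_{ij} \sim \mathcal{N}\left(0, \frac{2\sigma_w^2}{m}\right)$. Generate a matrix $U \in \mathbb{R}^{m \times d}$ where $U_{ij} \sim \mathcal{N}\left(0, \frac{2}{m}\right)$. Generate a vector $a \in \mathbb{R}^{m}$ where $a_i \sim \mathcal{N}\left(0, \frac{1}{m}\right)$. Find a fixed point $T$ of the equation $T = \varphi(WT + UX)$ by using the Anderson acceleration method (Walker & Ni, 2011). |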
| Open Source Code | No | The paper does not provide any explicit statement about open-sourcing the code or a link to a code repository. |
| Open Datasets | Yes | In this section, we conduct experiments to validate Theorem 3. Specifically, we evaluate the performance of the DEQ model on the MNIST and CIFAR-10 datasets. |
| Dataset Splits | No | The paper mentions using MNIST and CIFAR-10 datasets and normalizing data points, but does not specify training, validation, or test splits, or any methodology for creating them. |
| Hardware Specification | No | The paper does not specify any hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers used for the experiments. |
| Experiment Setup | No | While the paper describes varying parameters such as 'm' and activation functions, it lacks specific details regarding hyperparameters for the numerical experiments, such as the exact learning rate used, batch size, optimizer, or number of epochs. |
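
The WIALG steps quoted in the Pseudocode row can be sketched in NumPy as below. This is only an illustrative sketch, not the author's code: the function name `wialg_init`, the choice of `tanh` with Lipschitz constant `L = 1`, and the layout of `X` as a `d x n` matrix are all assumptions. It also swaps in plain fixed-point (Picard) iteration for the Anderson acceleration named in the paper, which is a simpler but slower solver.

```python
import numpy as np


def wialg_init(X, m, L=1.0, phi=np.tanh, tol=1e-10, max_iter=500, seed=0):
    """Hypothetical sketch of the WIALG initialisation.

    X is assumed to be a (d, n) data matrix; phi is assumed L-Lipschitz.
    """
    rng = np.random.default_rng(seed)
    d, n = X.shape

    # sigma_w^2 = 1 / (96 L^2), as stated in the quoted pseudocode.
    sigma_w2 = 1.0 / (96.0 * L**2)

    # rng.normal takes a standard deviation, so pass the square root
    # of each variance from the pseudocode.
    W = rng.normal(0.0, np.sqrt(2.0 * sigma_w2 / m), size=(m, m))
    U = rng.normal(0.0, np.sqrt(2.0 / m), size=(m, d))
    a = rng.normal(0.0, np.sqrt(1.0 / m), size=m)

    # Solve T = phi(W T + U X) by plain Picard iteration.
    # Assumption: this replaces Anderson acceleration (Walker & Ni, 2011).
    UX = U @ X
    T = np.zeros((m, n))
    for _ in range(max_iter):
        T_next = phi(W @ T + UX)
        converged = np.linalg.norm(T_next - T) <= tol
        T = T_next
        if converged:
            break
    return W, U, a, T
```

With this scaling the entries of `W` are small, so the map `T -> phi(W T + U X)` would typically be a contraction and the simple iteration settles quickly; Anderson acceleration would mainly reduce the number of iterations.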