Global curvature for second-order optimization of neural networks
Authors: Alberto Bernacchia
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate the practical implications of our framework, we apply second-order optimization to synthetic data, achieving markedly faster convergence compared to traditional optimization methods. |
| Researcher Affiliation | Industry | 1MediaTek Research, Cambridge, UK. Correspondence to: Alberto Bernacchia <EMAIL>. |
| Pseudocode | Yes | A detailed description of the complete procedure is provided in Algorithm 1 in the Appendix, using the simple case of a two-layer MLP with Tanh activation and no bias. |
| Open Source Code | Yes | Code: github.com/mtkresearch/symo notebooks |
| Open Datasets | No | The synthetic dataset consists of 5000 training and 5000 testing data points, where the input is sampled from a Gaussian distribution with zero mean. The covariance matrix of the input is generated using random orthogonal eigenvectors (Mezzadri, 2007), and the eigenvalues are set on a logarithmic grid between 10^-5 and 10^0. |
| Dataset Splits | Yes | The synthetic dataset consists of 5000 training and 5000 testing data points, where the input is sampled from a Gaussian distribution with zero mean. |
| Hardware Specification | No | These are matrix-matrix products of size equal to the neural network width, that can be computed efficiently using a GPU. |
| Software Dependencies | No | In PyTorch for example, Assumption 2.1 holds for nn.init.normal and nn.init.orthogonal... |
| Experiment Setup | Yes | For all optimizers, learning rate is set by a grid search. For second-order optimizers, we additionally set a second hyperparameter by grid search: damping λ for KFAC, initialization ϵ for Shampoo and decay parameter β for SymO. |
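The synthetic dataset described above (zero-mean Gaussian inputs whose covariance has random orthogonal eigenvectors and eigenvalues on a logarithmic grid) can be reproduced with a short script. The sketch below is an illustration, not the authors' code: the function name `make_synthetic_dataset` and the input dimension are assumptions, and the Haar-distributed orthogonal matrix is drawn via the QR sign-correction trick of Mezzadri (2007) as cited in the paper.

```python
import numpy as np

def make_synthetic_dataset(n_train=5000, n_test=5000, dim=100, seed=0):
    """Zero-mean Gaussian inputs with a controlled covariance spectrum:
    random orthogonal eigenvectors, eigenvalues on a log grid in [1e-5, 1].
    `dim` is an assumed value; the paper's report does not quote it here."""
    rng = np.random.default_rng(seed)
    # Haar-random orthogonal matrix: QR of a Gaussian matrix, with column
    # signs fixed by the diagonal of R (Mezzadri, 2007).
    A = rng.standard_normal((dim, dim))
    Q, R = np.linalg.qr(A)
    Q *= np.sign(np.diag(R))
    # Eigenvalues on a logarithmic grid between 10^-5 and 10^0.
    eigvals = np.logspace(-5, 0, dim)
    # x = Q diag(sqrt(eigvals)) z with z ~ N(0, I), so Cov(x) = Q diag(eigvals) Q^T.
    sqrt_cov = Q * np.sqrt(eigvals)  # scales column j of Q by sqrt(eigval_j)
    z = rng.standard_normal((n_train + n_test, dim))
    X = z @ sqrt_cov.T
    return X[:n_train], X[n_train:]
```

With this construction the empirical covariance of the samples converges to the target spectrum, which is what makes the problem's conditioning controllable for the optimizer comparison.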