Global Convergence in Neural ODEs: Impact of Activation Functions

Authors: Tianxiang Gao, Siyuan Sun, Hailiang Liu, Hongyang Gao

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our theoretical findings are validated by numerical experiments, which not only support our analysis but also provide practical guidelines for scaling Neural ODEs, potentially leading to faster training and improved performance in real-world applications.
Researcher Affiliation | Academia | Tianxiang Gao, DePaul University (EMAIL); Siyuan Sun, Iowa State University (EMAIL); Hailiang Liu, Iowa State University (EMAIL); Hongyang Gao, Iowa State University (EMAIL)
Pseudocode | Yes | Algorithm 1 (ResNet f_θ^L Forward Computation on Input x) and Algorithm 2 (ResNet f_θ^L Forward and Backward Computation on Input x)
Open Source Code | No | The paper does not provide an explicit statement about releasing code or a link to a source code repository for the methodology described.
Open Datasets | Yes | Both the Neural ODE and ResNet were initialized with the same random weights and evaluated on the MNIST dataset, with ResNet depths L ranging from 10 to 1,000. We used Softplus activation to ensure smoothness. Additionally, we also include convergence analysis on diverse datasets, such as CIFAR-10, AG News, and Daily Climate, as well as additional activations like GELU, further demonstrating the generalizability of our findings.
Dataset Splits | No | The paper mentions the MNIST, CIFAR-10, AG News, and Daily Climate datasets, and refers to a 'training set' and 'test loss'. For instance, Section 6 states 'the number of training samples (i.e., which is 1000 in our experiments)', and Section H.2 states 'By sampling 500 examples from the MNIST training set'. However, it does not provide specific percentages or absolute sample counts for training, validation, and test splits, nor does it explicitly reference standard splits with citations that define these details.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU types, or cloud computing specifications.
Software Dependencies | No | The paper mentions ODE solvers like 'Euler, rk4, and dopri5' in Section H.7, but it does not specify their version numbers or any other software dependencies with their versions.
Experiment Setup | Yes | We evaluated Neural ODE models with increasing widths, ranging from 10 to 1,000, and computed the NTK for each width. We monitored both the NTK's smallest eigenvalue and the distance of the model parameters from their initial values over 100 epochs. Softplus was used as the activation function to ensure smoothness and non-polynomial nonlinearity. Additionally, Section H.5 states 'The optimizer used was gradient descent with a learning rate of 0.1, and models were trained for 100 epochs' for experiments on diverse datasets with 'different widths (i.e., 500, 1000, 2000, 3000)'.
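Since the paper's code is not released, the forward pass referenced in its Algorithm 1 (ResNet f_θ^L Forward Computation) can only be sketched. The following is a minimal illustration: it uses the Softplus activation and depth L from the paper, but the 1/L residual scaling (which makes depth-L ResNets approximate a Neural ODE as L grows) and all function names are assumptions, not the paper's exact algorithm.

```python
import numpy as np

def softplus(x):
    # Smooth, non-polynomial activation used in the paper's experiments
    return np.log1p(np.exp(x))

def resnet_forward(x, weights, L):
    """Illustrative ResNet f_theta^L forward pass (sketch, not the paper's code).

    Residual update h_{l+1} = h_l + (1/L) * softplus(W_l h_l); the 1/L step
    size is an assumed discretization of dh/dt = softplus(W(t) h(t)),
    consistent with the ResNet-to-Neural-ODE limit discussed in the paper.
    """
    h = np.asarray(x, dtype=float)
    for l in range(L):
        h = h + (1.0 / L) * softplus(weights[l] @ h)
    return h

# Toy usage: width-4 hidden state, depth L = 10
rng = np.random.default_rng(0)
W = [rng.standard_normal((4, 4)) / np.sqrt(4) for _ in range(10)]
out = resnet_forward(rng.standard_normal(4), W, L=10)
```

As L increases (the paper sweeps 10 to 1,000), this iteration converges to the ODE solution at time 1, which is why the paper can compare ResNet and Neural ODE behavior under shared initial weights.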
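The experiment-setup row above describes monitoring the NTK's smallest eigenvalue across widths. A minimal sketch of how that quantity can be computed for an empirical NTK is shown below, using a toy one-hidden-layer Softplus network f(x) = vᵀ softplus(Wx); the architecture, gradients, and sizes here are illustrative assumptions, not the paper's Neural ODE model.

```python
import numpy as np

def ntk_min_eigenvalue(X, W, v):
    """Smallest eigenvalue of the empirical NTK Gram matrix K = J J^T.

    Row i of J is the gradient of f(x_i) = v^T softplus(W x_i) with respect
    to all parameters (W, v). A positive lambda_min(K) indicates the kernel
    is positive definite, the condition the paper's convergence analysis
    tracks during training. Toy model for illustration only.
    """
    grads = []
    for x in X:
        z = W @ x
        a = np.log1p(np.exp(z))           # softplus(z)
        s = 1.0 / (1.0 + np.exp(-z))      # softplus'(z) = sigmoid(z)
        grad_W = np.outer(v * s, x)       # df/dW, shape (width, dim)
        grad_v = a                        # df/dv, shape (width,)
        grads.append(np.concatenate([grad_W.ravel(), grad_v]))
    J = np.stack(grads)                   # (n_samples, n_params)
    K = J @ J.T                           # empirical NTK Gram matrix
    return float(np.linalg.eigvalsh(K)[0])  # eigvalsh sorts ascending

# Toy usage: 20 samples, input dim 5, hidden width 100
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))
lam_min = ntk_min_eigenvalue(
    X,
    rng.standard_normal((100, 5)) / np.sqrt(5),
    rng.standard_normal(100) / np.sqrt(100),
)
```

Re-evaluating lam_min after each training epoch, and at increasing widths, mirrors the paper's check that the NTK stays well conditioned while parameters remain close to initialization.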