Analysis of generalization capacities of Neural Ordinary Differential Equations

Authors: Madhusudan Verma, Manoj Kumar

TMLR 2025

Reproducibility Variable / Result / LLM Response

Research Type: Experimental
  "We performed numerical experiments in section 5. In the end, some concluding remarks are given in section 6. ... The results show that as the number of hidden units increases, the generalization error also increases. This observation empirically validates Theorem 4.9..."

Researcher Affiliation: Academia
  Madhusudan Verma, School of Engineering and Science, Indian Institute of Technology Madras, Zanzibar Campus; Manoj Kumar, School of Engineering and Science, Indian Institute of Technology Madras, Zanzibar Campus.

Pseudocode: No
  The paper includes mathematical equations, definitions, lemmas, and proofs, but no clearly labeled "Pseudocode" or "Algorithm" blocks, nor structured steps formatted like code.

Open Source Code: Yes
  Code is open-source at: https://github.com/Madhusudan-Verma/Gen-bound-Node

Open Datasets: Yes
  "We conducted experiments on the MNIST and CIFAR-10 datasets to investigate the relationship between the Lipschitz constant of the dynamics function in a Neural ODE and its generalization performance."

Dataset Splits: Yes
  "The training set comprises 100 samples, while the test set includes 30 samples. ... The MNIST dataset consists of 60,000 training and 10,000 testing grayscale images... The CIFAR-10 dataset includes 50,000 training and 10,000 testing color images... The training dataset consists of 100 data points, and the validation dataset contains 20 data points."

Hardware Specification: No
  The paper states that training was conducted "on a GPU if available" but does not specify a GPU model, CPU, memory, or any other hardware details.

Software Dependencies: No
  The paper mentions the Adam optimizer, the torchdiffeq.odeint solver, and the torch and numpy libraries, but provides no version numbers for any of these components.

Experiment Setup: Yes
  "For each configuration, the model is trained for 100 epochs using the Adam optimizer with a learning rate of 0.01. The loss function used is the mean squared error (MSE)... Training was performed for 10 epochs using the Adam optimizer with a learning rate of 1e-3 and cross-entropy loss. A batch size of 128 was used... Loss = MSE + λ · sup_{0 ≤ k ≤ N−1} ||A_{k+1} − A_k||, where λ is the regularization strength..."
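The regularized objective quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration only: it assumes the A_k are the per-step weight matrices of the discretized dynamics and uses the Frobenius norm; the function name, argument names, and choice of norm are assumptions, not the authors' code.

```python
import numpy as np

def regularized_loss(pred, target, weights, lam=0.01):
    """Sketch of Loss = MSE + lam * sup_{0 <= k <= N-1} ||A_{k+1} - A_k||.

    `weights` is a list of per-step weight matrices A_0, ..., A_N
    (hypothetical representation of the dynamics function's parameters).
    """
    # Standard mean squared error between predictions and targets.
    mse = np.mean((pred - target) ** 2)
    # Largest norm of the difference between consecutive weight matrices;
    # penalizing this sup discourages abrupt changes in the dynamics
    # across integration steps.  np.linalg.norm defaults to the
    # Frobenius norm for matrices.
    sup_diff = max(
        np.linalg.norm(weights[k + 1] - weights[k])
        for k in range(len(weights) - 1)
    )
    return mse + lam * sup_diff
```

With perfectly fitted predictions the MSE term vanishes and only the weight-smoothness penalty remains, which makes the role of λ easy to inspect in isolation.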