Generalization Bounds and Model Complexity for Kolmogorov–Arnold Networks

Authors: Xianyang Zhang, Huijuan Zhou

ICLR 2025

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "These bounds are empirically investigated for KANs trained with stochastic gradient descent on simulated and real data sets. The numerical results demonstrate the practical relevance of these bounds." |
| Researcher Affiliation | Academia | Xianyang Zhang, Department of Statistics, Texas A&M University, College Station, TX 77843, USA, EMAIL; Huijuan Zhou, School of Statistics and Data Science, Shanghai University of Finance and Economics, Shanghai, China, EMAIL |
| Pseudocode | No | The paper describes its methods mathematically and in text but does not include any explicitly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | No | The paper makes no concrete statement about releasing source code for the described methodology and includes no links to code repositories. |
| Open Datasets | Yes | "We also investigated the MNIST and CIFAR-10 datasets." |
| Dataset Splits | Yes | "We set the sample sizes of both the training set and test set to be 10,000 for all four datasets." |
| Hardware Specification | No | The paper gives no specifics about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions using the SiLU function and basis splines in the implemented KANs but names no software with version numbers (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | "We run 1000 epochs for each dataset. ... We apply the dropout technique with a rate of 0.1 to the activation functions in KAN networks (referred to as regularized KAN) with the same shapes as we have used in Section 3 (i.e., [4, 50, 100, 50, 1] for both setups)." |
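The regularized-KAN setup quoted in the Experiment Setup row (dropout with rate 0.1 applied to activations, layer widths [4, 50, 100, 50, 1]) can be sketched in plain Python. This is an illustrative stand-in for the dropout mechanism only, not the authors' implementation; the function name `dropout` and the fixed seed are assumptions for the example.

```python
import random

# Layer widths reported in the paper for both setups.
widths = [4, 50, 100, 50, 1]

def dropout(values, rate=0.1, training=True, rng=random.Random(0)):
    """Inverted dropout sketch: during training, zero each activation with
    probability `rate` and rescale the survivors by 1/(1 - rate) so the
    expected activation is unchanged; at evaluation time, pass through."""
    if not training:
        return list(values)
    return [0.0 if rng.random() < rate else v / (1.0 - rate) for v in values]

# Example: apply rate-0.1 dropout to one hidden layer's activations.
hidden = [1.0] * widths[1]          # 50 activations
dropped = dropout(hidden, rate=0.1)  # some zeroed, rest scaled to 1/0.9
```

In the paper's regularized-KAN variant, this kind of masking is applied to the learned activation functions of the network rather than to plain neuron outputs, but the rate-0.1 Bernoulli masking with rescaling is the same idea.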