KAN: Kolmogorov–Arnold Networks
Authors: Ziming Liu, Yixuan Wang, Sachin Vaidya, Fabian Ruehle, James Halverson, Marin Soljacic, Thomas Hou, Max Tegmark
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through two examples in mathematics and physics, KANs are shown to be useful collaborators helping scientists (re)discover mathematical and physical laws. Moreover, KANs are shown to be more accurate and have faster scaling laws than MLPs in function fitting and PDE solving, both theoretically and empirically. |
| Researcher Affiliation | Academia | 1 Massachusetts Institute of Technology 2 California Institute of Technology 3 Northeastern University 4 The NSF Institute for Artificial Intelligence and Fundamental Interactions |
| Pseudocode | No | The paper describes methods textually and with equations, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions PyTorch as the framework used to build its code, but does not explicitly state that the code for this specific work is open-sourced, nor does it provide a repository link. |
| Open Datasets | Yes | The paper mentions using specific public datasets and benchmarks: "PDEBench Takamoto et al. (2022)", "MNIST", "Feynman datasets Udrescu & Tegmark (2020); Udrescu et al. (2020)". |
| Dataset Splits | Yes | The paper provides specific dataset split information, for example: "randomly generated 1000 training and test samples from U[-1, 1]^2" for toy function fitting, and for MNIST: "The whole training dataset (60000) and test dataset (10000) are used to evaluate train/test loss/acc." |
| Hardware Specification | Yes | All models are trained with the Adam Optimizer for 15000 steps with learning rate decay (5000 steps for learning rate 10^-3, 10^-4 and 10^-5), with batch size 1024, on a V100 GPU. |
| Software Dependencies | No | The paper mentions "Codes are built based on pytorch Paszke et al. (2019)" and "Sympy is used to compute the symbolic formula", but does not provide specific version numbers for these or other key software dependencies. |
| Experiment Setup | Yes | All models are trained with the Adam Optimizer for 15000 steps with learning rate decay (5000 steps for learning rate 10^-3, 10^-4 and 10^-5), with batch size 1024, on a V100 GPU. For PDE solving, specific parameters are mentioned: "Adam optimizers with a learning rate 10^-3 for 1000 steps except for 10000 steps for MLP (10x training)." and "α = 0.01" for loss balancing. |
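The training recipe quoted in the table can be sketched as follows. This is a hypothetical reconstruction, not the authors' code (which is not linked): the model, the toy target function, and the sampled data are placeholders, and the schedule assumes "learning rate decay (5000 steps for learning rate 10^-3, 10^-4 and 10^-5)" means three consecutive 5000-step phases.

```python
import torch

# Placeholder model standing in for a KAN/MLP; the paper's architectures differ.
model = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.SiLU(), torch.nn.Linear(64, 1)
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Piecewise-constant decay: 5000 steps each at 1e-3, 1e-4, 1e-5 (15000 steps total).
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lambda step: 0.1 ** (step // 5000)
)

# A batch of 1024 samples drawn from U[-1, 1]^2, as in the toy function-fitting
# setup; the target function here is an arbitrary placeholder.
x = torch.rand(1024, 2) * 2 - 1
y = torch.sin(torch.pi * x[:, :1] * x[:, 1:])

for step in range(15000):
    optimizer.zero_grad()
    loss = torch.mean((model(x) - y) ** 2)
    loss.backward()
    optimizer.step()
    scheduler.step()
```

With `LambdaLR`, the learning rate is the base rate times the lambda's value at the current step, so the three 5000-step phases run at 10^-3, 10^-4, and 10^-5 as described.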