Deep Learning Alternatives Of The Kolmogorov Superposition Theorem
Authors: Leonardo Ferreira Guilhoto, Paris Perdikaris
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate ActNet in the context of Physics-Informed Neural Networks (PINNs), a framework well-suited for leveraging KST’s strengths in low-dimensional function approximation, particularly for simulating partial differential equations (PDEs). In this challenging setting, where models must learn latent functions without direct measurements, ActNet consistently outperforms KANs across multiple benchmarks and is competitive against the current best MLP-based approaches. |
| Researcher Affiliation | Academia | Leonardo Ferreira Guilhoto, Graduate Group in Applied Mathematics and Computational Science, University of Pennsylvania, Philadelphia, PA 19104, USA, EMAIL; Paris Perdikaris, Department of Mechanical Engineering & Applied Mechanics, University of Pennsylvania, Philadelphia, PA 19104, USA, EMAIL |
| Pseudocode | No | The paper describes the architecture and mathematical formulations in detail but does not present a distinct pseudocode or algorithm block. |
| Open Source Code | Yes | Our entire code-base is publicly available online on GitHub. The code used to carry out the ablation experiments can be found at the GitHub repository https://github.com/PredictiveIntelligenceLab/ActNet. Our ActNet implementation is also available in the open-source JaxPi framework at https://github.com/PredictiveIntelligenceLab/jaxpi/tree/ActNet, which we used to carry out the comparisons against the current state-of-the-art. |
| Open Datasets | No | The paper focuses on Physics-Informed Neural Networks (PINNs), which learn solutions to Partial Differential Equations (PDEs) by minimizing residuals, rather than training on pre-existing external datasets. The problems (Poisson, Helmholtz, Allen-Cahn, Advection, and Kuramoto–Sivashinsky equations) are defined within the paper with initial conditions and forcing terms, and data points are sampled from the domain during training, not sourced from a public dataset. |
| Dataset Splits | No | The paper discusses Physics-Informed Neural Networks (PINNs), which typically do not rely on pre-defined training/test/validation splits of external datasets. Instead, data points are sampled from the problem domain during training. For example, it mentions using 'a batch size of 5,000 points uniformly sampled at random on [0, 1]² at each step' for the Poisson and Helmholtz equations, which describes a sampling strategy rather than a dataset split. |
| Hardware Specification | Yes | Tables 7–9: Average computational time per Adam training iteration for the Poisson, Helmholtz, and Allen-Cahn problems on an Nvidia RTX A6000 GPU. |
| Software Dependencies | Yes | We also thank ... the developers of software that enabled this research, including JAX (Bradbury et al., 2018), Flax (Heek et al., 2023), Matplotlib (Hunter, 2007), and NumPy (Harris et al., 2020). |
| Experiment Setup | Yes | Each model was trained using Adam (Kingma & Ba, 2017) for 30,000 iterations, then fine-tuned using L-BFGS (Liu & Nocedal, 1989) for 100 iterations. We use a batch size of 5,000 points uniformly sampled at random on [0, 1]² at each step. For training using Adam, we use learning rate warmup from 10⁻⁷ to 5×10⁻³ over 1,000 iterations, then exponential decay with rate 0.75 every 1,000 steps, and adaptive gradient clipping with parameter 0.01 as described in Brock et al. (2021). For ActNet, we use {8, 16, 32} values of N and for KAN grid resolutions of {3, 10, 30}. For SIREN we consider ω₀ in {πw/3, πw, 3πw}. For MLP we consider activations in {tanh, sigmoid, GELU}. |
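The learning-rate schedule quoted in the experiment setup can be sketched in plain Python. This is an illustrative reconstruction, not the authors' code; in particular, the staircase decay (the factor of 0.75 applied at each 1,000-step boundary after warmup) is an assumption about how "exponential decay with rate 0.75 every 1,000 steps" is applied.

```python
def lr_schedule(step, init=1e-7, peak=5e-3, warmup=1_000,
                decay_rate=0.75, decay_every=1_000):
    """Learning rate at a given Adam step: linear warmup, then exponential decay.

    Matches the reported settings: warmup from 1e-7 to 5e-3 over 1,000
    iterations, then decay by a factor of 0.75 every 1,000 steps.
    Staircase decay is an assumption; a continuous variant would replace
    the integer division below with a true division.
    """
    if step < warmup:
        # Linear warmup from `init` to `peak`.
        return init + (peak - init) * step / warmup
    # Staircase exponential decay after warmup.
    return peak * decay_rate ** ((step - warmup) // decay_every)

# The peak rate is reached at the end of warmup.
print(lr_schedule(1_000))  # 0.005
```

In a JAX training loop this would typically be expressed with `optax` (e.g. a warmup-plus-exponential-decay schedule chained with adaptive gradient clipping), but the pure-Python form above makes the reported numbers explicit.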