X-KAN: Optimizing Local Kolmogorov-Arnold Networks via Evolutionary Rule-Based Machine Learning

Authors: Hiroki Shiraishi, Hisao Ishibuchi, Masaya Nakata

IJCAI 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Our experimental results on artificial test functions and real-world datasets demonstrate that X-KAN significantly outperforms conventional methods, including XCSF, Multi-Layer Perceptron, and KAN, in terms of approximation accuracy.
Researcher Affiliation Academia (1) Faculty of Engineering, Yokohama National University; (2) Department of Computer Science and Engineering, Southern University of Science and Technology
Pseudocode Yes Algorithm 1 presents our algorithm for X-KAN training.
Open Source Code Yes Our X-KAN implementation and an extended version of this paper, including appendices, are available at https://doi.org/10.48550/arXiv.2505.14273.
Open Datasets Yes We evaluate X-KAN's performance on eight function approximation problems: four test functions shown in Fig. 2 from [Stein et al., 2018] and four real-world datasets from [Heider et al., 2023]. For details of these problems, kindly refer to Appendices F and G.
Dataset Splits Yes Performance evaluation uses Mean Absolute Error (MAE) on test data over 30 trials of Monte Carlo cross-validation, with 90% training and 10% testing data splits.
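The evaluation protocol quoted above (30 trials of Monte Carlo cross-validation with random 90%/10% train/test splits, scored by test-set MAE) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; `model_fn` is a hypothetical stand-in for training any of the compared models and returning a predictor.

```python
import numpy as np

def monte_carlo_cv_mae(model_fn, X, y, n_trials=30, test_frac=0.1, seed=0):
    """Monte Carlo cross-validation: repeated random train/test splits,
    returning the test-set Mean Absolute Error of each trial."""
    rng = np.random.default_rng(seed)
    n = len(X)
    n_test = max(1, int(round(test_frac * n)))
    maes = []
    for _ in range(n_trials):
        perm = rng.permutation(n)                      # fresh random split per trial
        test_idx, train_idx = perm[:n_test], perm[n_test:]
        predict = model_fn(X[train_idx], y[train_idx])  # train on 90% of the data
        y_pred = predict(X[test_idx])                   # evaluate on held-out 10%
        maes.append(np.mean(np.abs(y_pred - y[test_idx])))
    return np.array(maes)
```

Averaging the returned array reproduces the "MAE over 30 trials" figure the report refers to.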
Hardware Specification Yes Fig. 4 shows the average runtime per trial under an experimental environment running Ubuntu 24.04.1 LTS with an Intel Core i9-13900F CPU (5.60 GHz) and 32GB RAM.
Software Dependencies No While MLP, KAN, and local KAN models in X-KAN were implemented in Python by the original KAN authors, the XCSF and X-KAN frameworks were implemented in Julia [Bezanson et al., 2017] by the authors. X-KAN calls Python-based local KAN models from its Julia framework.
Experiment Setup Yes The hyperparameters for XCSF and X-KAN are set to r0 = 1.0, P# ∈ {0.0 (test functions), 0.8 (real-world datasets)}, ε0 = 0.02, β = 0.2, θEA = 100, τ = 0.4, χ = 0.8, μ = 0.04, m0 = 0.1, and N = 50. The maximum number of training iterations for XCSF and X-KAN is 10 epochs. The same architecture in Eq. (5) is used for KAN and each rule in X-KAN, which consists of three layers with 2n + 1 nodes in the hidden layer, where G = 3 and K = 3 for B-spline parameters. The three-layer MLP architecture in Eq. (3) with H hidden nodes is used together with SiLU activation functions. ... All network hyperparameters follow the original KAN authors' implementation, with training conducted for 10 epochs. Input features are normalized to [0, 1], and data targets are normalized to [-1, 1].
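The normalization step quoted above (inputs scaled to [0, 1], targets to [-1, 1]) is plain min-max scaling. A minimal sketch, assuming per-feature min-max statistics computed from the data itself (the paper does not spell out whether statistics come from the training split only):

```python
import numpy as np

def normalize_features(X):
    """Min-max scale each input feature column to [0, 1]."""
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    span = np.where(x_max > x_min, x_max - x_min, 1.0)  # guard constant columns
    return (X - x_min) / span

def normalize_targets(y):
    """Min-max scale targets to [-1, 1]."""
    y_min, y_max = y.min(), y.max()
    span = y_max - y_min if y_max > y_min else 1.0      # guard constant targets
    return 2.0 * (y - y_min) / span - 1.0
```

Scaling targets to [-1, 1] rather than [0, 1] matches the symmetric output range commonly used with spline-based KAN layers.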