On the Robustness of Kolmogorov-Arnold Networks: An Adversarial Perspective
Authors: Tal Alter, Raz Lapid, Moshe Sipper
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Across a wide range of benchmark datasets (MNIST, Fashion MNIST, KMNIST, CIFAR-10, SVHN, and a subset of ImageNet), we compare KANs against conventional architectures using an extensive suite of attacks, including white-box methods (FGSM, PGD, C&W, MIM), black-box approaches (Square Attack, SimBA, NES), and ensemble attacks (AutoAttack). Our experiments reveal that while small- and medium-scale KANs are not consistently more robust than their standard counterparts, large-scale KANs exhibit markedly enhanced resilience against adversarial perturbations. An ablation study further demonstrates that critical hyperparameters such as the number of knots and the spline order significantly influence robustness. Moreover, adversarial training experiments confirm the inherent safety advantages of KAN-based architectures. |
| Researcher Affiliation | Collaboration | Tal Alter (EMAIL), Ben-Gurion University of the Negev, Beer-Sheva, 8410501, Israel; Raz Lapid (EMAIL), Ben-Gurion University of the Negev, Beer-Sheva, 8410501, Israel & DeepKeep, Tel-Aviv, Israel; Moshe Sipper (EMAIL), Ben-Gurion University of the Negev, Beer-Sheva, 8410501, Israel |
| Pseudocode | No | The paper describes methods using mathematical formulations (e.g., equations for FGSM, PGD, C&W, MIM, Square Attack, SimBA, NES in Appendix B) but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing the source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Across a wide range of benchmark datasets (MNIST, Fashion MNIST, KMNIST, CIFAR-10, SVHN, and a subset of ImageNet), we compare KANs against conventional architectures using an extensive suite of attacks, including white-box methods (FGSM, PGD, C&W, MIM), black-box approaches (Square Attack, SimBA, NES), and ensemble attacks (AutoAttack). Our experiments reveal that while small- and medium-scale KANs are not consistently more robust than their standard counterparts, large-scale KANs exhibit markedly enhanced resilience against adversarial perturbations. |
| Dataset Splits | Yes | The fully connected models were trained on the MNIST (Deng, 2012), Fashion MNIST (Xiao et al., 2017), and KMNIST (Prabhu, 2019) datasets with a learning rate of 1 × 10⁻⁴, weight decay of 5 × 10⁻⁴, and a batch size of 64. The convolutional models were trained on the MNIST, SVHN (Netzer et al., 2011), and CIFAR-10 (Krizhevsky, 2009) datasets using a learning rate of 1 × 10⁻⁴, weight decay of 1 × 10⁻⁴, and a batch size of 32. White-box attacks and AutoAttack were evaluated on all test sets, while black-box attacks were evaluated on a random sample of 1,000 images from the corresponding test set. Transferability metrics were calculated using 5,000 images from the test sets. |
| Hardware Specification | No | The paper mentions 'Due to memory constraints, the input resolutions varied based on model size' and 'computational constraints at our university' but does not specify any particular GPU, CPU, or other hardware model numbers used for the experiments. |
| Software Dependencies | No | The paper mentions using the AdamW optimizer, but it does not specify any software library names with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | Yes | All the models were trained using the AdamW (Loshchilov & Hutter, 2017) optimizer for 20 epochs. The fully connected models were trained on the MNIST (Deng, 2012), Fashion MNIST (Xiao et al., 2017), and KMNIST (Prabhu, 2019) datasets with a learning rate of 1 × 10⁻⁴, weight decay of 5 × 10⁻⁴, and a batch size of 64. The convolutional models were trained on the MNIST, SVHN (Netzer et al., 2011), and CIFAR-10 (Krizhevsky, 2009) datasets using a learning rate of 1 × 10⁻⁴, weight decay of 1 × 10⁻⁴, and a batch size of 32. All the KAN models were configured with a uniform setting of num knots = 5 and spline order = 3. The hyperparameter configurations for these experiments are provided in Table 3. |
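The white-box attacks quoted in the table (FGSM, PGD) perturb an input in the direction of the sign of the loss gradient; PGD simply iterates the FGSM step with projection. The sketch below illustrates a single FGSM step against a toy linear softmax classifier in pure Python. It is an illustrative sketch only, not the paper's implementation: the linear model, the `fgsm` helper, and all variable names are assumptions introduced here.

```python
import math

def softmax(logits):
    # numerically stable softmax
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fgsm(x, y, W, eps):
    """One FGSM step against a hypothetical linear softmax classifier.

    x:   flat input with pixel values in [0, 1]
    y:   true class index
    W:   weight matrix as a list of rows (one row per class)
    eps: L-infinity perturbation budget
    """
    # forward pass: logits and class probabilities
    logits = [sum(w_i * x_i for w_i, x_i in zip(row, x)) for row in W]
    p = softmax(logits)
    # gradient of cross-entropy loss w.r.t. logits: p_k - 1[k == y]
    dlogits = [p_k - (1.0 if k == y else 0.0) for k, p_k in enumerate(p)]
    # back-propagate to the input: dL/dx_i = sum_k dlogits_k * W[k][i]
    grad = [sum(dlogits[k] * W[k][i] for k in range(len(W)))
            for i in range(len(x))]
    sign = lambda g: (g > 0) - (g < 0)
    # step eps in the gradient-sign direction, then clip to the valid range
    return [min(1.0, max(0.0, x_i + eps * sign(g)))
            for x_i, g in zip(x, grad)]

# usage: perturb a 3-pixel "image" with budget eps = 0.1
W = [[0.2, -0.5, 0.1], [-0.3, 0.8, 0.0]]
x = [0.5, 0.4, 0.9]
x_adv = fgsm(x, y=0, W=W, eps=0.1)
```

Running the same step `T` times from `x_adv`, re-clipping into an eps-ball around the original input after each step, gives the PGD variant also evaluated in the paper.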