Geometrically Inspired Kernel Machines for Collaborative Learning Beyond Gradient Descent

Authors: Mohit Kumar, Alexander Valentinitsch, Magdalena Fuchs, Mathias Brucker, Juliana Bowles, Adnan Husakovic, Ali Abbas, Bernhard A. Moser

JAIR 2025

Reproducibility Variable | Result | LLM Response
Research Type: Experimental (5 experiments). LLM response: "Existing research [26] has shown not only the privacy-preserving potential of KAHMs (by fabricating privacy-preserving data) but also that they remain computationally practical, given that they are capable of addressing the computational challenges of big data (as stated in Remark 12). Thus, experiments establishing the privacy-preserving property and computational efficiency are not repeated here. However, Assumption 1, made here to derive a learning solution (cf. Theorem 4.1), still needs to be validated through experiments; consequently, we have conducted experiments to this effect (see Section 5.1). Further, the competitive performance advantage of a KAHM-based approach to collaborative learning in a federated setting still needs to be established by comparison against state-of-the-art methods. Federated learning experiments are provided in Section 5.2, followed by experiments on knowledge transfer across clients in Section 5.3. Finally, in Section 5.4, we show the effectiveness of our method in the single-class data scenario."
Researcher Affiliation: Collaboration. LLM response: "MOHIT KUMAR, University of Rostock, Germany and Software Competence Center Hagenberg GmbH, Austria; ALEXANDER VALENTINITSCH, Software Competence Center Hagenberg GmbH, Austria; MAGDALENA FUCHS, ETH Zürich, Switzerland; MATHIAS BRUCKER, Software Competence Center Hagenberg GmbH, Austria; JULIANA BOWLES, University of St Andrews, UK; ADNAN HUSAKOVIC, Primetals Technologies Austria GmbH, Austria; ALI ABBAS, Primetals Technologies Austria GmbH, Austria; BERNHARD A. MOSER, Johannes Kepler University, Austria and Software Competence Center Hagenberg GmbH, Austria"
Pseudocode: Yes. LLM response (Appendix B, "Practical Choice for KAHM Subspace Dimension"): Given N samples, the subspace dimension n cannot exceed the data dimension p, nor N-1 (since the number of principal components with non-zero variance cannot exceed N-1). Further, n should not be so high that the variance along some principal component becomes negligible. This can be ensured by checking the data variance along each principal component and, if needed, decrementing n by 1 until the condition holds. The following algorithm is suggested to determine n in practice:

Require: Dataset {y_i ∈ R^p}, i = 1, ..., N.
1: n ← min(20, p, N-1).
2: Define P ∈ R^{n×p} such that the i-th row of P equals the transpose of the eigenvector corresponding to the i-th largest eigenvalue of the sample covariance matrix of {y_1, ..., y_N}.
3: Define x_i = P y_i, i ∈ {1, 2, ..., N}.
4: while min_{1≤j≤n} [ max_{1≤i≤N} (x_i)_j - min_{1≤i≤N} (x_i)_j ] < 10^{-3} do
5:   n ← n - 1.
6:   Redefine P ∈ R^{n×p} from the top-n eigenvectors, as in step 2.
7:   Redefine x_i = P y_i, i ∈ {1, 2, ..., N}.
8: end while
9: return n
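The dimension-selection loop above can be sketched in Python with NumPy (a minimal illustration; the function and argument names are assumptions, and the paper's actual implementation is in MATLAB):

```python
import numpy as np

def kahm_subspace_dim(Y, max_dim=20, min_spread=1e-3):
    """Sketch of the Appendix B procedure: project onto the top-n
    principal components and shrink n until every retained component
    has a non-negligible spread (max minus min) across the samples."""
    N, p = Y.shape
    n = min(max_dim, p, N - 1)          # step 1: n <= min(20, p, N-1)
    # eigendecomposition of the sample covariance matrix
    cov = np.cov(Y, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]   # sort eigenvalues descending
    eigvecs = eigvecs[:, order]
    while n > 1:
        P = eigvecs[:, :n].T            # rows = top-n eigenvectors
        X = Y @ P.T                     # projections x_i = P y_i
        spread = X.max(axis=0) - X.min(axis=0)
        if spread.min() >= min_spread:  # every component carries variance
            break
        n -= 1                          # else decrement n and retry
    return n
```

For example, on data whose variance is concentrated in two directions, the loop shrinks n from its initial value down to 2.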
Open Source Code: Yes. LLM response: "Implementation. The method was implemented using MATLAB (R2024a) and the source code was made publicly available at https://github.com/software-competence-center-hagenberg/GIKM."
Open Datasets: Yes. LLM response: "Datasets. We study the multi-class classification problem on benchmark datasets including MNIST, Freiburg Groceries, Fashion MNIST, CIFAR-10, CIFAR-100, and Office-Caltech-10. The MNIST dataset contains 28×28 images (of digits) divided into 60000 training images and 10000 test images. The Freiburg Groceries dataset [14] has 4947 images of grocery products (commonly sold in Germany) labelled into 25 different categories and divided into 3929 training images and 1018 test images. The Fashion MNIST dataset contains 60000 training and 10000 test 28×28 grayscale images of fashion products from 10 categories. The CIFAR-10 dataset contains 50000 training and 10000 test 32×32 color images from 10 different classes. The CIFAR-100 dataset consists of 60000 32×32 color images from 100 classes, with 500 training images and 100 test images per class. The Office-Caltech-10 dataset, containing the 10 categories shared between the Office dataset and the Caltech-256 dataset, consists of images from four data sources: Amazon (958 images), Caltech (1123 images), DSLR (157 images), and Webcam (295 images)."
Dataset Splits: Yes. LLM response: "The MNIST dataset contains 28×28 images (of digits) divided into 60000 training images and 10000 test images. The Freiburg Groceries dataset [14] has 4947 images of grocery products (commonly sold in Germany) labelled into 25 different categories and divided into 3929 training images and 1018 test images. The Fashion MNIST dataset contains 60000 training and 10000 test 28×28 grayscale images of fashion products from 10 categories. The CIFAR-10 dataset contains 50000 training and 10000 test 32×32 color images from 10 different classes. The CIFAR-100 dataset consists of 60000 32×32 color images from 100 classes, with 500 training images and 100 test images per class. ... Following [47], we consider a non-iid label-skew 20% (or 30%) federated learning setting, in which the number of clients is 100; each client is first randomly assigned 20% (or 30%) of the total available class labels in a dataset, and then the training samples of each class are randomly distributed equally among the clients who have been assigned that class; all of the test samples of a class are assigned to every client who has been assigned that class; the accuracy over a client's test data, averaged across clients, is calculated to evaluate performance. ... the number of training samples per class in the source domain is 20 for Amazon and 8 for Caltech, DSLR, and Webcam; the number of labelled samples per class in the target domain is 3 for all four domains; 20 random train/test splits are created and the performance on target-domain test samples is averaged over the 20 experiments. ... As in previous studies on this dataset [24, 22, 28], the experiments are performed for 5 different train-test splits of the data."
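The non-iid label-skew partitioning described in the quote can be sketched as follows (an illustrative Python reconstruction under stated assumptions; function names, data layout, and the seeding scheme are not taken from the paper's code):

```python
import random
from collections import defaultdict

def label_skew_partition(samples_per_class, num_classes=10,
                         num_clients=100, skew=0.2, seed=0):
    """Assign each client `skew` (e.g. 20%) of the class labels at
    random, then split each class's training samples equally among
    the clients that hold that class."""
    rng = random.Random(seed)
    k = max(1, round(skew * num_classes))       # labels per client
    client_classes = [rng.sample(range(num_classes), k)
                      for _ in range(num_clients)]
    # index: which clients were assigned each class
    holders = defaultdict(list)
    for client, classes in enumerate(client_classes):
        for cls in classes:
            holders[cls].append(client)
    # distribute each class's sample ids round-robin among its holders
    client_data = defaultdict(list)
    for cls, sample_ids in samples_per_class.items():
        ids = list(sample_ids)
        rng.shuffle(ids)
        owners = holders.get(cls, [])
        for i, sid in enumerate(ids):
            if owners:
                client_data[owners[i % len(owners)]].append((cls, sid))
    return client_classes, dict(client_data)
```

With 100 clients and a 20% skew over 10 classes, each client receives exactly 2 class labels, matching the setting quoted above; test samples of a class would, per the paper, be replicated to every holder of that class rather than split.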
Hardware Specification: Yes. LLM response: "Implementation. The method was implemented using MATLAB (R2024a) and the source code was made publicly available at https://github.com/software-competence-center-hagenberg/GIKM. The experiments were performed on an iMac (M1, 2021) machine with 8 GB RAM."
Software Dependencies: Yes. LLM response: "Implementation. The method was implemented using MATLAB (R2024a) and the source code was made publicly available at https://github.com/software-competence-center-hagenberg/GIKM."
Experiment Setup: Yes. LLM response: "Data Processing. KAHMs are built from training data samples. Since our experiments are on images, a feature vector needs to be extracted from each image so that the client- and class-specific KAHMs can be built from the extracted feature vectors. For the MNIST and Fashion MNIST datasets, the 28×28 pixel values of each image are divided by 255 (to scale the values to the range from 0 to 1) and flattened into an equivalent 784-dimensional data point. For the Freiburg Groceries, CIFAR-10, CIFAR-100, and Office-Caltech-10 datasets, the existing ResNet-50 neural network is employed as a feature extractor by using the activations of the avg_pool layer (i.e. the last average pooling layer just before the fully connected layer) as features, resulting in a 2048-dimensional data point for each image. For Office-Caltech-10 images in the transfer learning experiments, a 4096-dimensional data point is additionally computed from the activations of the fc6 layer of the existing VGG-16 neural network, to compare results with previous studies using the same features. Finally, for all datasets, the hyperbolic tangent function operates along each dimension of a data point to constrain the values between -1 and +1, yielding the feature vectors considered for classification. It is worth mentioning that the proposed method does not involve any free parameters to be tuned, since building a KAHM requires only selecting the subspace dimension n ≤ p, which is determined as stated in Appendix B."
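The MNIST/Fashion-MNIST branch of this pipeline (scale to [0, 1], flatten, tanh-squash) can be sketched in a few lines of Python (the function name is illustrative; the ResNet-50/VGG-16 feature-extraction branches are omitted):

```python
import numpy as np

def preprocess_mnist_batch(images):
    """Sketch of the described MNIST/Fashion-MNIST feature pipeline:
    scale 28x28 pixel values to [0, 1], flatten each image to a 784-d
    vector, then squash every dimension with tanh into (-1, +1)."""
    x = images.astype(np.float64) / 255.0   # scale pixels to [0, 1]
    x = x.reshape(x.shape[0], -1)           # flatten to 784-d vectors
    return np.tanh(x)                       # constrain values to (-1, +1)
```

Since the scaled pixel values are non-negative, the resulting features lie in [0, tanh(1)] ≈ [0, 0.76]; the tanh step matters more for the signed ResNet-50 and VGG-16 activations, which it maps into (-1, +1) as stated in the quote.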