Sparse Multiple Kernel Learning: Alternating Best Response and Semidefinite Relaxations

Authors: Dimitris Bertsimas, Caio de Próspero Iglesias, Nicholas A. G. Johnson

TMLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental On ten UCI benchmarks, our method with random initialization outperforms state-of-the-art MKL approaches in out-of-sample prediction accuracy on average by 3.34 percentage points (relative to the best performing benchmark) while selecting a small number of candidate kernels in comparable runtime. We present rigorous numerical results across UCI benchmark datasets. Finally, in Section 5 we investigate the performance of our algorithm against benchmark methods on real world UCI datasets.
Researcher Affiliation Academia Dimitris Bertsimas EMAIL Massachusetts Institute of Technology Cambridge, MA 02139, USA Caio de Próspero Iglesias EMAIL Massachusetts Institute of Technology Cambridge, MA 02139, USA Nicholas A. G. Johnson EMAIL Massachusetts Institute of Technology Cambridge, MA 02139, USA
Pseudocode Yes Algorithm 1: Alternating Best Response for Sparse MKL
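As a rough illustration of the alternating best-response idea named above, the following NumPy sketch alternates between (i) fitting dual coefficients for the current weighted kernel combination and (ii) a best-response weight update that keeps only the k0 highest-scoring candidate kernels. This is a simplified stand-in, not the paper's Algorithm 1: the kernel-ridge surrogate for the SVM step, the alignment-style scoring rule, and all default parameter values are assumptions made for illustration.

```python
import numpy as np

def alternating_best_response(kernels, y, k0=2, lam=1.0, iters=20):
    """Hedged sketch of alternating best response for sparse MKL.

    kernels: list of n-by-n PSD Gram matrices; y: length-n targets.
    Alternates a kernel ridge solve (fixed weights) with a sparse
    weight update that retains the top-k0 kernels by the score
    alpha' K_i alpha (an assumed surrogate, not the paper's rule).
    """
    m, n = len(kernels), len(y)
    w = np.full(m, 1.0 / m)                      # start from uniform weights
    for _ in range(iters):
        K = sum(wi * Ki for wi, Ki in zip(w, kernels))
        # (i) dual coefficients of kernel ridge regression on the combined kernel
        alpha = np.linalg.solve(K + lam * np.eye(n), y)
        # (ii) score each candidate kernel and keep only the k0 best
        scores = np.array([alpha @ Ki @ alpha for Ki in kernels])
        top = np.argsort(scores)[-k0:]
        w = np.zeros(m)
        w[top] = scores[top] / scores[top].sum() # renormalize surviving weights
    return w, alpha
```

The returned weight vector has at most k0 nonzero entries, which is the sparsity pattern the paper's k0 parameter controls.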
Open Source Code Yes To bridge the gap between theory and practice, we have made our code freely available at https://github.com/iglesiascaio/Sparse MKL.
Open Datasets Yes We evaluate all methods on ten benchmark binary classification tasks drawn from the UCI Machine Learning Repository (Dua & Graff, 2017).
Dataset Splits Yes To ensure reproducibility, we fixed the random seed, shuffled the rows, and split 80% of the examples into a training set and the remaining 20% into a test set.
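The seeded shuffle-then-split protocol described above can be sketched in a few lines; the seed value and function name here are illustrative assumptions, not taken from the paper's code.

```python
import numpy as np

def train_test_split_80_20(X, y, seed=0):
    """Seeded shuffle followed by an 80/20 train/test split,
    mirroring the protocol described above (seed is an assumption)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))                # shuffle the rows reproducibly
    cut = int(0.8 * len(y))                      # first 80% -> train, rest -> test
    tr, te = idx[:cut], idx[cut:]
    return X[tr], y[tr], X[te], y[te]
```

Fixing the seed makes the shuffle, and hence the exact train/test partition, reproducible across runs.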
Hardware Specification Yes We perform experiments on MIT's SuperCloud Cluster (Reuther et al., 2018), which hosts Intel Xeon Platinum 8260 processors.
Software Dependencies Yes All experiments were run using Julia 1.10.1 and Python 3.10.14. Semidefinite programs were solved with MOSEK 10.1.31, while all other optimization relied on Julia's LIBSVM v0.8.0 and MKLpy 0.6.
Experiment Setup Yes For Algorithm 1, we perform 10-fold cross-validation over the training set to select the regularization and sparsity parameters from the grid C ∈ {5, 10, 50, 100}, λ ∈ {0.01, 0.1, 1, 10, 100}, k0 ∈ {1, 2, 3, 4, 5}.
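The grid above contains 4 × 5 × 5 = 100 hyperparameter configurations, each scored by 10-fold cross-validation on the training set. A minimal sketch of enumerating the grid and building shuffled fold indices (the shuffle seed is an assumption):

```python
import numpy as np
from itertools import product

# Hyperparameter grid from the setup above: 4 * 5 * 5 = 100 configurations.
grid = list(product([5, 10, 50, 100],           # C
                    [0.01, 0.1, 1, 10, 100],    # lambda
                    [1, 2, 3, 4, 5]))           # k0

def kfold_indices(n, k=10, seed=0):
    """Shuffled k-fold (train_idx, val_idx) pairs over n training examples."""
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, k)
    return [(np.concatenate(folds[:i] + folds[i + 1:]), folds[i])
            for i in range(k)]
```

Each configuration in `grid` would then be evaluated on every `(train_idx, val_idx)` pair, and the configuration with the best average validation score is kept.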