Sparse Multiple Kernel Learning: Alternating Best Response and Semidefinite Relaxations

Authors: Dimitris Bertsimas, Caio de Próspero Iglesias, Nicholas A. G. Johnson

TMLR 2025

Reproducibility Variable Result LLM Response
Research Type Experimental On ten UCI benchmarks, our method with random initialization outperforms state-of-the-art MKL approaches in out-of-sample prediction accuracy on average by 3.34 percentage points (relative to the best performing benchmark) while selecting a small number of candidate kernels in comparable runtime. We present rigorous numerical results across UCI benchmark datasets. Finally, in Section 5 we investigate the performance of our algorithm against benchmark methods on real world UCI datasets.
Researcher Affiliation Academia Dimitris Bertsimas EMAIL Massachusetts Institute of Technology Cambridge, MA 02139, USA Caio de Próspero Iglesias EMAIL Massachusetts Institute of Technology Cambridge, MA 02139, USA Nicholas A. G. Johnson EMAIL Massachusetts Institute of Technology Cambridge, MA 02139, USA
Pseudocode Yes Algorithm 1: Alternating Best Response for Sparse MKL
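As a rough illustration of the alternating best-response idea named above, the following NumPy sketch alternates between (i) fitting dual coefficients for the current weighted kernel combination and (ii) a best-response weight update that keeps only the k0 highest-scoring candidate kernels. This is a simplified stand-in, not the paper's Algorithm 1: the kernel-ridge surrogate for the SVM step, the alignment-style scoring rule, and all default parameter values are assumptions made for illustration.

```python
import numpy as np

def alternating_best_response(kernels, y, k0=2, lam=1.0, iters=20):
    """Hedged sketch of alternating best response for sparse MKL.

    kernels: list of n-by-n PSD Gram matrices; y: length-n targets.
    Alternates a kernel ridge solve (fixed weights) with a sparse
    weight update that retains the top-k0 kernels by the score
    alpha' K_i alpha (an assumed surrogate, not the paper's rule).
    """
    m, n = len(kernels), len(y)
    w = np.full(m, 1.0 / m)                      # start from uniform weights
    for _ in range(iters):
        K = sum(wi * Ki for wi, Ki in zip(w, kernels))
        # (i) dual coefficients of kernel ridge regression on the combined kernel
        alpha = np.linalg.solve(K + lam * np.eye(n), y)
        # (ii) score each candidate kernel and keep only the k0 best
        scores = np.array([alpha @ Ki @ alpha for Ki in kernels])
        top = np.argsort(scores)[-k0:]
        w = np.zeros(m)
        w[top] = scores[top] / scores[top].sum() # renormalize surviving weights
    return w, alpha
```

The returned weight vector has at most k0 nonzero entries, which is the sparsity pattern the paper's k0 parameter controls.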
Open Source Code Yes To bridge the gap between theory and practice, we have made our code freely available at https://github.com/iglesiascaio/Sparse MKL.
Open Datasets Yes We evaluate all methods on ten benchmark binary classification tasks drawn from the UCI Machine Learning Repository (Dua & Graff, 2017).
Dataset Splits Yes To ensure reproducibility, we fixed the random seed, shuffled the rows, and split 80% of the examples into a training set and the remaining 20% into a test set.
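The seeded shuffle-then-split protocol described above can be sketched in a few lines; the seed value and function name here are illustrative assumptions, not taken from the paper's code.

```python
import numpy as np

def train_test_split_80_20(X, y, seed=0):
    """Seeded shuffle followed by an 80/20 train/test split,
    mirroring the protocol described above (seed is an assumption)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))                # shuffle the rows reproducibly
    cut = int(0.8 * len(y))                      # first 80% -> train, rest -> test
    tr, te = idx[:cut], idx[cut:]
    return X[tr], y[tr], X[te], y[te]
```

Fixing the seed makes the shuffle, and hence the exact train/test partition, reproducible across runs.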
Hardware Specification Yes We perform experiments on MIT's SuperCloud Cluster (Reuther et al., 2018), which hosts Intel Xeon Platinum 8260 processors.
Software Dependencies Yes All experiments were run using Julia 1.10.1 and Python 3.10.14. Semidefinite programs were solved with MOSEK 10.1.31, while all other optimization relied on Julia's LIBSVM v0.8.0 and MKLpy 0.6.
Experiment Setup Yes For Algorithm 1, we perform 10-fold cross-validation over the training set to select the regularization and sparsity parameters from the grid C ∈ {5, 10, 50, 100}, λ ∈ {0.01, 0.1, 1, 10, 100}, k0 ∈ {1, 2, 3, 4, 5}.
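The grid above contains 4 × 5 × 5 = 100 hyperparameter configurations, each scored by 10-fold cross-validation on the training set. A minimal sketch of enumerating the grid and building shuffled fold indices (the shuffle seed is an assumption):

```python
import numpy as np
from itertools import product

# Hyperparameter grid from the setup above: 4 * 5 * 5 = 100 configurations.
grid = list(product([5, 10, 50, 100],           # C
                    [0.01, 0.1, 1, 10, 100],    # lambda
                    [1, 2, 3, 4, 5]))           # k0

def kfold_indices(n, k=10, seed=0):
    """Shuffled k-fold (train_idx, val_idx) pairs over n training examples."""
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, k)
    return [(np.concatenate(folds[:i] + folds[i + 1:]), folds[i])
            for i in range(k)]
```

Each configuration in `grid` would then be evaluated on every `(train_idx, val_idx)` pair, and the configuration with the best average validation score is kept.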