Joker: Joint Optimization Framework for Lightweight Kernel Machines

Authors: Junhong Zhang, Zhihui Lai

ICML 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that Joker saves up to 90% of memory while achieving training time and performance comparable to, or better than, state-of-the-art methods. Table 4 shows the results of the performance comparison.
Researcher Affiliation | Academia | (1) College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China; (2) Guangdong Provincial Key Laboratory of Intelligent Information Processing, Shenzhen, 518060, China. Correspondence to: Zhihui Lai <lai zhi EMAIL>.
Pseudocode | Yes | Algorithm 1: Trust region (twice-differentiable f); Algorithm 2: Truncated CG-Steihaug; Algorithm 3: DBCD-TR for problem (4).
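Algorithm 2, truncated CG-Steihaug, is a standard routine for the trust-region subproblem: approximately minimize g·p + ½ p·Hp subject to ‖p‖ ≤ Δ, terminating early on negative curvature or when the step hits the region boundary. The sketch below is a generic textbook version, not the paper's implementation; the function name, tolerances, and dense NumPy arrays are assumptions for illustration.

```python
import numpy as np

def cg_steihaug(H, g, delta, tol=1e-8, max_iter=100):
    """Approximately minimize g @ p + 0.5 * p @ H @ p s.t. ||p|| <= delta."""
    p = np.zeros_like(g)
    r = g.copy()          # residual = gradient of the quadratic model at p
    d = -r                # initial search direction
    if np.linalg.norm(r) < tol:
        return p
    for _ in range(max_iter):
        Hd = H @ d
        dHd = d @ Hd
        if dHd <= 0:
            # Negative curvature: follow d to the trust-region boundary
            return _to_boundary(p, d, delta)
        alpha = (r @ r) / dHd
        p_next = p + alpha * d
        if np.linalg.norm(p_next) >= delta:
            # Full CG step leaves the region: stop at the boundary
            return _to_boundary(p, d, delta)
        r_next = r + alpha * Hd
        if np.linalg.norm(r_next) < tol:
            return p_next
        beta = (r_next @ r_next) / (r @ r)
        d = -r_next + beta * d
        p, r = p_next, r_next
    return p

def _to_boundary(p, d, delta):
    # Solve ||p + tau * d|| = delta for the positive root tau
    a = d @ d
    b = 2 * (p @ d)
    c = p @ p - delta ** 2
    tau = (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)
    return p + tau * d
```

With H = I and g = (−1, 0), the unconstrained minimizer is p = (1, 0); shrinking Δ below 1 instead returns the boundary point (Δ, 0), matching the early-termination logic above.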
Open Source Code | Yes | The implementation of Joker is based on PyTorch without extra acceleration libraries. Code available at GitHub: https://github.com/Apple-Zhang/Joker-paper.
Open Datasets | Yes | Million Song Dataset (MSD) (Bertin-Mahieux et al., 2011): contains audio features for year prediction; available at https://archive.ics.uci.edu/dataset/203/yearpredictionmsd. Household Electric Power Consumption (HEPC) dataset: the same as the House Elec dataset used in (Lin et al., 2024); available via the Python package https://github.com/treforevans/uci_datasets. Supersymmetric particle classification (SUSY) dataset (Baldi et al., 2014): a binary classification task to distinguish supersymmetric particles from background processes; available at https://archive.ics.uci.edu/dataset/279/susy. HIGGS dataset (Baldi et al., 2014): a binary classification task to distinguish the Higgs boson from the background process; available at https://archive.ics.uci.edu/dataset/280/higgs. CIFAR-5M dataset (Nakkiran et al., 2021): a generated dataset based on CIFAR-10; available at https://github.com/preetum/cifar5m.
Dataset Splits | Yes | Table A.1 (summary of datasets used in experiments): MSD: 90% train / 10% test; HEPC: 90% train / 10% test; SUSY: 80% train / 20% test; HIGGS: 80% train / 20% test; CIFAR-5M: 80% train / 20% test.
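Fixed-fraction splits like those in Table A.1 are straightforward to reproduce with a seeded shuffle. The sketch below is illustrative only; the seed and shuffling scheme are assumptions, not details from the paper.

```python
import numpy as np

def train_test_split(n_samples, test_frac, seed=0):
    """Return (train_idx, test_idx) for a seeded shuffled partition."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_frac))
    return idx[n_test:], idx[:n_test]

# e.g. a 90% / 10% split as used for MSD and HEPC
train_idx, test_idx = train_test_split(1000, 0.10)
```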
Hardware Specification | Yes | We implemented KRR, KLR, SVM, etc. based on Joker. To highlight that Joker can obtain promising performance under a limited computational budget, we conducted experiments on a machine with a single consumer GPU (NVIDIA RTX 3080, 10GB) and 64GB RAM.
Software Dependencies | No | The implementation of Joker is based on PyTorch without extra acceleration libraries. Explanation: the paper mentions PyTorch as a dependency but does not specify its version number, or versions for any other key software components, which is required for reproducibility.
Experiment Setup | Yes | The regularization parameter λ in Joker is tuned from {2^i : i = −7, −6, ..., 7} via grid search. In most cases, we employ the inexact Joker models with block size |B| = 512. We also increase the block size to 1024 for Joker-KLR to accelerate convergence. Details of further parameter settings are shown in Appendix C; Table A.3 lists the major hyperparameters of the models in the experiments.
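The grid search over λ ∈ {2^i : i = −7, ..., 7} can be sketched as below, here scoring each candidate by validation MSE for plain kernel ridge regression. This is a generic illustration of the tuning loop, not Joker itself; the RBF kernel, its bandwidth, and the helper names are all assumptions.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian (RBF) kernel matrix between rows of X and rows of Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit_krr(K, y, lam):
    """Solve the KRR system (K + lam * I) alpha = y."""
    return np.linalg.solve(K + lam * np.eye(K.shape[0]), y)

def grid_search_lambda(X_tr, y_tr, X_val, y_val):
    """Pick lambda from {2^i : i = -7..7} by validation MSE."""
    K_tr = rbf_kernel(X_tr, X_tr)
    K_val = rbf_kernel(X_val, X_tr)
    best_mse, best_lam = np.inf, None
    for i in range(-7, 8):
        lam = 2.0 ** i
        alpha = fit_krr(K_tr, y_tr, lam)
        mse = np.mean((K_val @ alpha - y_val) ** 2)
        if mse < best_mse:
            best_mse, best_lam = mse, lam
    return best_lam
```

In practice the same loop would wrap whichever Joker model is being tuned; only the fit/score calls change.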