Optimal Learning of Kernel Logistic Regression for Complex Classification Scenarios
Authors: Hongwei Wen, Annika Betken, Hanyuan Hang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 EXPERIMENTS: In this section, we evaluate the CCP-based methods on three classification datasets: the Dionis and Satimage datasets from the OpenML Science Platform (Vanschoren et al., 2014), and the Gas Sensor dataset from the UCI ML Repository (Kelly et al., 2007). A detailed description of the data generation process and the hyperparameter grids for KLR is provided in Appendix F. Validation of Effectiveness: The CCP-based estimator q̂(y\|x) in Eq. (3), tailored for complex classification tasks on distribution Q, is referred to as the CCP method. Similarly, the estimator p̂(y\|x) in Eq. (10), designed for standard classification on data from P, is referred to as the baseline method. To validate the effectiveness of the CCP-based method in complex classification tasks, we compare its performance against that of the baseline method. Table 1 summarizes the label prediction accuracy of both estimators on the test data D_t^u drawn from distribution Q, evaluated across three complex classification scenarios. |
| Researcher Affiliation | Collaboration | Hongwei Wen & Annika Betken, Faculty of Electrical Engineering, Mathematics and Computer Science, University of Twente, Enschede, The Netherlands; Hanyuan Hang, Hong Kong Research Institute, Contemporary Amperex Technology (Hong Kong) Limited, Hong Kong Science Park, New Territories, Hong Kong |
| Pseudocode | No | The paper describes algorithms and methods using mathematical equations and textual descriptions, but it does not contain any clearly labeled pseudocode blocks or algorithm figures. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or provide a link to a code repository. |
| Open Datasets | Yes | In this section, we evaluate the CCP-based methods on three classification datasets: the Dionis and Satimage datasets from the Open ML Science Platform (Vanschoren et al., 2014), and the Gas Sensor dataset from the UCI ML Repository (Kelly et al., 2007). |
| Dataset Splits | Yes | F EXPERIMENTAL DETAILS: Using the benchmark datasets, we construct the labeled data D_p and the unlabeled test data D_t^u := (X_i^(t))_{i=1}^{n_t} from the distribution Q to evaluate performance. Additionally, we construct D_q^u for label shift adaptation and D_s for transfer learning. To achieve this, we first generate label distributions using either a uniform or Dirichlet distribution. Based on these distributions, we randomly resample data from the original benchmark datasets to create the required datasets. Specifically, we generate 10 labeled datasets D_p, and for each D_p, we create 10 test datasets D_t^u and 10 additional datasets (e.g., D_q^u or D_s) using different random seeds, resulting in a total of 100 repeated experiments. Table 2 (Data Descriptions in Three Complex Classification Scenarios): Dionis: n = 416188, d = 61, M = 355; long-tailed n_p = 14200, n_t = 7100; domain adaptation n_p = 14200, n_q = 7100, n_t = 14200; transfer learning n_p = 14200, n_s = 5000, n_t = 7100. Gas Sensor: n = 13910, d = 128, M = 6; long-tailed n_p = 3000, n_t = 1000; domain adaptation n_p = 3000, n_q = 1000, n_t = 3000; transfer learning n_p = 3000, n_s = 1000, n_t = 1000. Satimage: n = 6430, d = 36, M = 6; long-tailed n_p = 1500, n_t = 1000; domain adaptation n_p = 2400, n_q = 1200, n_t = 1200; transfer learning n_p = 1500, n_s = 1000, n_t = 1000. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | When fitting the KLR model to the source domain D_p, we select two key hyperparameters: the regularization parameter C and the kernel coefficient γ. These are determined using 5-fold cross-validation. Unlike the commonly used classification loss, we use the CE loss as the criterion for cross-validation, as our primary objective is to accurately estimate p(y\|x). Specifically, C is chosen from seven values evenly spaced on a logarithmic scale between 10^-6 and 10^0, while γ is selected from seven values evenly spaced on a logarithmic scale between 2^-6 and 2^0. |
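The resampling procedure described under Dataset Splits (draw a label distribution from a Dirichlet, then resample the benchmark data to match it) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `resample_by_label_dist`, the concentration parameter `alpha`, and the toy data are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def resample_by_label_dist(X, y, n_out, alpha=1.0, rng=rng):
    """Resample (X, y) so labels follow a Dirichlet-drawn class distribution.

    A symmetric Dirichlet(alpha) draw gives the target class proportions
    (alpha is an illustrative choice); a multinomial draw converts them
    into per-class counts, and each class is resampled with replacement.
    """
    classes = np.unique(y)
    probs = rng.dirichlet(alpha * np.ones(len(classes)))
    counts = rng.multinomial(n_out, probs)
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=k, replace=True)
        for c, k in zip(classes, counts) if k > 0
    ])
    rng.shuffle(idx)
    return X[idx], y[idx]

# Toy stand-in for a benchmark dataset with 5 classes.
X = rng.normal(size=(500, 3))
y = rng.integers(0, 5, size=500)
X_p, y_p = resample_by_label_dist(X, y, n_out=200)
```

Drawing a fresh Dirichlet vector per dataset is one way to produce the differing label proportions across the 10 × 10 repeated splits; the uniform case simply fixes `probs` to equal proportions.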
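The Experiment Setup cell (5-fold CV over seven log-spaced C values in [10^-6, 10^0] and seven log-spaced γ values in [2^-6, 2^0], with CE loss as the criterion) maps naturally onto scikit-learn. The sketch below is an assumption about how one could reproduce it, not the paper's implementation: it approximates kernel logistic regression with a Nystroem RBF feature map followed by multinomial logistic regression, and uses `neg_log_loss` as the cross-entropy criterion.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the labeled source data D_p.
X, y = make_classification(n_samples=300, n_features=10, n_classes=3,
                           n_informative=6, random_state=0)

# Nystroem + logistic regression as a tractable KLR surrogate (assumption).
pipe = Pipeline([
    ("rbf", Nystroem(kernel="rbf", n_components=100, random_state=0)),
    ("klr", LogisticRegression(max_iter=2000)),
])

param_grid = {
    "klr__C": np.logspace(-6, 0, 7),            # 10^-6, ..., 10^0
    "rbf__gamma": np.logspace(-6, 0, 7, base=2.0),  # 2^-6, ..., 2^0
}

# CE loss (negative log-loss) as the CV criterion, per the paper's setup.
search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_log_loss")
search.fit(X, y)
proba = search.predict_proba(X)  # conditional probability estimates p(y|x)
```

Using `neg_log_loss` rather than accuracy matters here: the downstream CCP construction consumes the estimated conditional probabilities, so the hyperparameters should be tuned for calibrated probabilities, not just correct argmax labels.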