Optimal Learning of Kernel Logistic Regression for Complex Classification Scenarios
Authors: Hongwei Wen, Annika Betken, Hanyuan Hang
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 EXPERIMENTS: In this section, we evaluate the CCP-based methods on three classification datasets: the Dionis and Satimage datasets from the OpenML Science Platform (Vanschoren et al., 2014), and the Gas Sensor dataset from the UCI ML Repository (Kelly et al., 2007). A detailed description of the data generation process and the hyperparameter grids for KLR is provided in Appendix F. Validation of Effectiveness: The CCP-based estimator q̂(y\|x) in Eq. (3), tailored for complex classification tasks on distribution Q, is referred to as the CCP method. Similarly, the estimator p̂(y\|x) in Eq. (10), designed for standard classification on data from P, is referred to as the baseline method. To validate the effectiveness of the CCP-based method in complex classification tasks, we compare its performance against that of the baseline method. Table 1 summarizes the label prediction accuracy of both estimators on the test data D_t^u drawn from distribution Q, evaluated across three complex classification scenarios. |
| Researcher Affiliation | Collaboration | Hongwei Wen & Annika Betken, Faculty of Electrical Engineering, Mathematics and Computer Science, University of Twente, Enschede, The Netherlands; Hanyuan Hang, Hong Kong Research Institute, Contemporary Amperex Technology (Hong Kong) Limited, Hong Kong Science Park, New Territories, Hong Kong |
| Pseudocode | No | The paper describes algorithms and methods using mathematical equations and textual descriptions, but it does not contain any clearly labeled pseudocode blocks or algorithm figures. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or provide a link to a code repository. |
| Open Datasets | Yes | In this section, we evaluate the CCP-based methods on three classification datasets: the Dionis and Satimage datasets from the Open ML Science Platform (Vanschoren et al., 2014), and the Gas Sensor dataset from the UCI ML Repository (Kelly et al., 2007). |
| Dataset Splits | Yes | F EXPERIMENTAL DETAILS: Using the benchmark datasets, we construct the labeled data D_p and the unlabeled test data D_t^u := (X_i^(t))_{i=1}^{n_t} from the distribution Q to evaluate performance. Additionally, we construct D_q^u for label shift adaptation and D_s for transfer learning. To achieve this, we first generate label distributions using either a uniform or Dirichlet distribution. Based on these distributions, we randomly resample data from the original benchmark datasets to create the required datasets. Specifically, we generate 10 labeled datasets D_p, and for each D_p, we create 10 test datasets D_t^u and 10 additional datasets (e.g., D_q^u or D_s) using different random seeds, resulting in a total of 100 repeated experiments. Table 2 (Data Descriptions in Three Complex Classification Scenarios): Dionis: n = 416188, d = 61, M = 355; long-tailed n_p = 14200, n_t = 7100; domain adaptation n_p = 14200, n_q = 7100, n_t = 14200; transfer learning n_p = 14200, n_s = 5000, n_t = 7100. Gas Sensor: n = 13910, d = 128, M = 6; long-tailed n_p = 3000, n_t = 1000; domain adaptation n_p = 3000, n_q = 1000, n_t = 3000; transfer learning n_p = 3000, n_s = 1000, n_t = 1000. Satimage: n = 6430, d = 36, M = 6; long-tailed n_p = 1500, n_t = 1000; domain adaptation n_p = 2400, n_q = 1200, n_t = 1200; transfer learning n_p = 1500, n_s = 1000, n_t = 1000. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | When fitting the KLR model to the source domain D_p, we select two key hyperparameters: the regularization parameter C and the kernel coefficient γ. These are determined using 5-fold cross-validation. Unlike the commonly used classification loss, we use the CE loss as the criterion for cross-validation, as our primary objective is to accurately estimate p(y\|x). Specifically, C is chosen from seven values evenly spaced on a logarithmic scale between 10^-6 and 10^0, while γ is selected from seven values evenly spaced on a logarithmic scale between 2^-6 and 2^0. |
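The resampling procedure described under Dataset Splits (draw a label distribution from a Dirichlet, then resample the benchmark data to match it) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `resample_by_label_dist`, the concentration parameter `alpha`, and the toy data are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def resample_by_label_dist(X, y, n_out, alpha=1.0, rng=rng):
    """Resample (X, y) so labels follow a Dirichlet-drawn class distribution.

    A symmetric Dirichlet(alpha) draw gives the target class proportions
    (alpha is an illustrative choice); a multinomial draw converts them
    into per-class counts, and each class is resampled with replacement.
    """
    classes = np.unique(y)
    probs = rng.dirichlet(alpha * np.ones(len(classes)))
    counts = rng.multinomial(n_out, probs)
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=k, replace=True)
        for c, k in zip(classes, counts) if k > 0
    ])
    rng.shuffle(idx)
    return X[idx], y[idx]

# Toy stand-in for a benchmark dataset with 5 classes.
X = rng.normal(size=(500, 3))
y = rng.integers(0, 5, size=500)
X_p, y_p = resample_by_label_dist(X, y, n_out=200)
```

Drawing a fresh Dirichlet vector per dataset is one way to produce the differing label proportions across the 10 × 10 repeated splits; the uniform case simply fixes `probs` to equal proportions.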
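The Experiment Setup cell (5-fold CV over seven log-spaced C values in [10^-6, 10^0] and seven log-spaced γ values in [2^-6, 2^0], with CE loss as the criterion) maps naturally onto scikit-learn. The sketch below is an assumption about how one could reproduce it, not the paper's implementation: it approximates kernel logistic regression with a Nystroem RBF feature map followed by multinomial logistic regression, and uses `neg_log_loss` as the cross-entropy criterion.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the labeled source data D_p.
X, y = make_classification(n_samples=300, n_features=10, n_classes=3,
                           n_informative=6, random_state=0)

# Nystroem + logistic regression as a tractable KLR surrogate (assumption).
pipe = Pipeline([
    ("rbf", Nystroem(kernel="rbf", n_components=100, random_state=0)),
    ("klr", LogisticRegression(max_iter=2000)),
])

param_grid = {
    "klr__C": np.logspace(-6, 0, 7),            # 10^-6, ..., 10^0
    "rbf__gamma": np.logspace(-6, 0, 7, base=2.0),  # 2^-6, ..., 2^0
}

# CE loss (negative log-loss) as the CV criterion, per the paper's setup.
search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_log_loss")
search.fit(X, y)
proba = search.predict_proba(X)  # conditional probability estimates p(y|x)
```

Using `neg_log_loss` rather than accuracy matters here: the downstream CCP construction consumes the estimated conditional probabilities, so the hyperparameters should be tuned for calibrated probabilities, not just correct argmax labels.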