Sample-Optimal Agnostic Boosting with Unlabeled Data
Authors: Udaya Ghai, Karan Singh
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we demonstrate the empirical viability of our approach. Table 1 showcases the results from our initial experiments comparing Algorithm 1 with the agnostic boosting method introduced by Kanade & Kalai (2009), herein referred to as the Potential-based Agnostic Booster (PAB). These evaluations were performed on various UCI classification datasets (Sigillito et al., 1989; Hopkins et al., 1999; Smith et al., 1988; Hofmann, 1994; Sejnowski & Gorman, 1988; Breiman & Stone, 1984), employing decision stumps (Pedregosa et al., 2011) as the weak learners. |
| Researcher Affiliation | Collaboration | 1Amazon, NYC 2Tepper School of Business, Carnegie Mellon University. Correspondence to: Karan Singh <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Agnostic Boosting with Unlabeled Data |
| Open Source Code | No | The paper does not provide any specific statement or link regarding the availability of source code for the methodology described. |
| Open Datasets | Yes | These evaluations were performed on various UCI classification datasets (Sigillito et al., 1989; Hopkins et al., 1999; Smith et al., 1988; Hofmann, 1994; Sejnowski & Gorman, 1988; Breiman & Stone, 1984) |
| Dataset Splits | Yes | Table 1. 50-fold cross-validated accuracies of the Potential based Agnostic Booster (PAB) (Kanade & Kalai, 2009) and our proposed boosting algorithm on six datasets with 0%, 5%, 10%, and 20% added label noise (during training). Sonar and Ionosphere have 50% of labels dropped while the remaining datasets have 90% of labels dropped. |
| Hardware Specification | Yes | Experiments are all run on an M1 Macbook Pro and complete within an hour. |
| Software Dependencies | No | employing decision stumps (Pedregosa et al., 2011) as the weak learners. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., and Duchesnay, E. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825-2830, 2011. The paper mentions scikit-learn for decision stumps but does not specify a version number. |
| Experiment Setup | Yes | For PAB, the number of samples that can be fed to a weak learner in a round scales inversely with the number of boosting rounds, as the algorithm requires fresh samples each round. As such, we perform a grid search on the number of boosting rounds with T ∈ {25, 50, 100}, while we just use 100 for our implementation of Algorithm 1. In both algorithms we search over the parameter m, the number of samples we feed to the weak learner each round, with a grid of {5, 20, 50, 100}, though if such a setting is invalid for PAB, we continue until all samples are used. |
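The experimental setup quoted above (label noise injected during training, a fraction of labels dropped, decision stumps as weak learners) can be sketched in a minimal form. This is an illustrative assumption-laden sketch on synthetic data, not the paper's Algorithm 1 or the PAB baseline: the stump fitting, the synthetic dataset, and the 10% noise / 50% drop settings (the Sonar/Ionosphere configuration) are stand-ins.

```python
import numpy as np

def add_label_noise(y, noise_rate, rng):
    """Flip each +/-1 label independently with probability noise_rate."""
    flips = rng.random(len(y)) < noise_rate
    return np.where(flips, -y, y)

def drop_labels(n, drop_frac, rng):
    """Boolean mask of examples that KEEP their labels; the rest are unlabeled."""
    return rng.random(n) >= drop_frac

def fit_decision_stump(X, y):
    """Fit a one-split threshold classifier (decision stump) by exhaustive search
    over (feature, threshold, sign), maximizing training accuracy."""
    best = (0, 0.0, 1, -np.inf)  # (feature index, threshold, sign, train accuracy)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for s in (1, -1):
                pred = np.where(X[:, j] <= t, s, -s)
                acc = np.mean(pred == y)
                if acc > best[3]:
                    best = (j, t, s, acc)
    return best

def stump_predict(stump, X):
    j, t, s, _ = stump
    return np.where(X[:, j] <= t, s, -s)

rng = np.random.default_rng(0)
# Synthetic binary-classification data standing in for a UCI dataset.
X = rng.normal(size=(200, 3))
y = np.where(X[:, 0] > 0.2, 1, -1)

y_train = add_label_noise(y, 0.10, rng)   # 10% label noise during training
labeled = drop_labels(len(y), 0.50, rng)  # 50% of labels dropped (Sonar/Ionosphere setting)

# A single weak-learner call sees only the labeled examples.
stump = fit_decision_stump(X[labeled], y_train[labeled])
acc = np.mean(stump_predict(stump, X) == y)
```

A full reproduction would wrap calls like `fit_decision_stump` inside the boosting loop (with the grid over T and m described above) and average accuracies over 50 cross-validation folds; here only the data-corruption and weak-learning steps are shown.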