The Optimal Sample Complexity of PAC Learning

Authors: Steve Hanneke

JMLR 2016

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | This work establishes a new upper bound on the number of samples sufficient for PAC learning in the realizable case. The bound matches known lower bounds up to numerical constant factors, solving a long-standing open problem on the sample complexity of PAC learning. The technique and analysis build on a recent breakthrough by Hans Simon. Keywords: sample complexity, PAC learning, statistical learning theory, minimax analysis, learning algorithm
Researcher Affiliation | Academia | The only author listed is Steve Hanneke, with the email 'EMAIL'; no institutional affiliation (university or company) is stated in the paper. Because the paper appears in the Journal of Machine Learning Research, a prominent academic venue, we classify it as 'Academia' despite the lack of an explicit affiliation in the text.
Pseudocode | Yes | Now consider the following recursive algorithm, which takes as input two finite data sets, S and T, satisfying C[S ∪ T] ≠ ∅, and returns a finite sequence of data sets (referred to as subsamples of S ∪ T).
Algorithm: A(S; T)
0. If |S| ≤ 3
1. Return {S ∪ T}
2. Let S0 denote the first |S| − 3⌊|S|/4⌋ elements of S, S1 the next ⌊|S|/4⌋ elements, S2 the next ⌊|S|/4⌋ elements, and S3 the remaining ⌊|S|/4⌋ elements after that
3. Return A(S0; S2 ∪ S3 ∪ T) ∪ A(S0; S1 ∪ S3 ∪ T) ∪ A(S0; S1 ∪ S2 ∪ T)
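The recursive construction above can be sketched in Python. This is a minimal illustration, not the paper's implementation: the function name `subsample_splits` is hypothetical, sequences are modeled as Python lists, and the consistent learner (ERM) that the paper runs on each returned subsample, followed by a majority vote over the resulting classifiers, is only mentioned in comments.

```python
def subsample_splits(S, T):
    """Sketch of the recursive subsample construction A(S; T).

    S and T are lists of labeled examples. Returns a list of
    subsamples of S + T; the paper's learner trains an ERM
    (consistent) classifier on each subsample and predicts by
    majority vote over those classifiers.
    """
    # Base case: small S, return the single subsample S followed by T.
    if len(S) <= 3:
        return [S + T]
    q = len(S) // 4                  # size of S1, S2, S3: floor(|S|/4)
    S0 = S[: len(S) - 3 * q]         # first |S| - 3*floor(|S|/4) elements
    S1 = S[len(S) - 3 * q : len(S) - 2 * q]
    S2 = S[len(S) - 2 * q : len(S) - q]
    S3 = S[len(S) - q :]
    # Recurse on S0 three times, each time withholding one of S1, S2, S3.
    return (subsample_splits(S0, S2 + S3 + T)
            + subsample_splits(S0, S1 + S3 + T)
            + subsample_splits(S0, S1 + S2 + T))
```

Since each non-base call makes three recursive calls on a set of roughly a quarter the size, the number of returned subsamples is 3 raised to the recursion depth; for example, an input S of 16 elements yields 9 subsamples.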
Open Source Code | No | The paper does not provide any explicit statements about releasing source code, nor does it include links to a code repository or mention code in supplementary materials for the methodology described.
Open Datasets | No | The paper is theoretical in nature, focusing on sample complexity bounds for PAC learning. It does not conduct experiments with specific datasets and therefore does not provide concrete access information for any publicly available or open datasets.
Dataset Splits | No | The paper is theoretical and does not involve empirical experiments using datasets. Therefore, there is no mention of dataset splits (training, validation, test) or any methodology for data partitioning.
Hardware Specification | No | The paper is theoretical and focuses on mathematical proofs and algorithmic design. It does not describe any experimental results, and consequently there is no mention of specific hardware specifications (e.g., GPU/CPU models, memory) used for running experiments.
Software Dependencies | No | The paper is theoretical and does not report experimental results that would require specific software dependencies. While it discusses an 'efficient PAC learning algorithm for C', it does not list any programming languages, libraries, or solvers with version numbers.
Experiment Setup | No | The paper is theoretical and focuses on proving a new upper bound for PAC learning. It does not describe any empirical experiments, and thus no specific experimental setup details such as hyperparameters, training configurations, or system-level settings are provided.