Stability and Generalization in Structured Prediction
Authors: Ben London, Bert Huang, Lise Getoor
JMLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our primary contribution is a new PAC-Bayesian analysis of structured prediction, producing generalization bounds that decrease when either the number of examples, m, or the size of each example, n, increases. Under suitable conditions, our bounds can be as tight as O(1/√(mn)). Our results apply to any composition of loss function and hypothesis class that satisfies our local stability conditions, which includes a broad range of modeling regimes used in practice. We also propose a novel view of PAC-Bayesian derandomization, based on the principle of stability, which provides a general proof technique for converting a generalization bound for a randomized structured predictor into a bound for a deterministic structured predictor. As part of our analysis, we derive a new bound on the moment-generating function of a locally stable functional. |
| Researcher Affiliation | Academia | Ben London (EMAIL, University of Maryland); Bert Huang (EMAIL, Virginia Tech); Lise Getoor (EMAIL, University of California, Santa Cruz) |
| Pseudocode | No | The paper presents mathematical derivations, theorems, and proofs for PAC-Bayesian generalization bounds but does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code for the described methodology, nor does it provide links to any code repositories. |
| Open Datasets | No | The paper is theoretical and focuses on deriving generalization bounds. It discusses 'data distributions' and 'training examples' in a general sense within its theoretical framework, but it does not specify or provide access information for any particular dataset used in empirical studies. |
| Dataset Splits | No | The paper is theoretical and does not present empirical experiments that would require dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and focuses on mathematical derivations of generalization bounds. It does not describe any experiments that would require specific hardware for execution. |
| Software Dependencies | No | The paper is theoretical and presents mathematical proofs and derivations. It does not describe any computational experiments that would require specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and derives new PAC-Bayesian generalization bounds. It does not contain details about an experimental setup, hyperparameters, or training configurations. |
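The headline rate quoted in the Research Type row can be illustrated numerically. The sketch below is not from the paper; it simply assumes a bound of the hypothetical form c/√(mn), where c is an unspecified constant absorbed by the O-notation, to show how the bound tightens as either m (number of examples) or n (size of each structured example) grows.

```python
import math

def bound_rate(m, n, c=1.0):
    """Hypothetical generalization-bound rate of the form c / sqrt(m * n).

    m: number of training examples
    n: size (number of output variables) of each structured example
    c: unspecified constant hidden by the O-notation
    """
    return c / math.sqrt(m * n)

# The rate decreases when either m or n increases:
base = bound_rate(100, 10)
more_examples = bound_rate(1000, 10)   # larger m -> smaller rate
larger_examples = bound_rate(100, 100)  # larger n -> smaller rate
print(base, more_examples, larger_examples)
```

This illustrates why the paper's setting is notable: even with a fixed number of training examples m, the bound can shrink as the examples themselves grow in size n.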