reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Bridging Supervised Learning and Test-Based Co-optimization

Authors: Elena Popovici

JMLR 2017

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	As a proof of concept, a theoretical study is presented on the connection between existence / lack of free lunch in the two fields, showcasing a few ideas for improving computational complexity of certain supervised learning approaches. We structure the presentation in 3 incremental steps with respect to metric-evaluation costs: in Section 3.1 we describe the small-scale case of only one or two M-evaluations in total at most one of which is remaining in the budget; in Section 3.2 we progress to histories of arbitrarily-many already-evaluated interactions, but still at most one remaining to be evaluated; finally, in Section 3.3 we discuss the most general case of arbitrarily-large spent budgets and arbitrarily-large remaining budgets. In each section we first review the co-optimization perspective and results pertaining to free lunch and optimality, then derive and contrast their counterparts for supervised learning. We focus specifically on binary classification, but many of the ideas presented would apply in a multi-class situation. Throughout, we differentiate between the nature of free lunches for output mechanisms versus exploration mechanisms. We keep the presentation as informal as possible and support it with small but concrete examples; precise mathematical definitions of all concepts involved and proofs of the results can be found in the accompanying Online Appendix A.
Researcher Affiliation	Industry	Elena Popovici, Icosystem Corp., 222 Third Street, Suite 0142, Cambridge, MA 02142, USA
Pseudocode	No	The paper describes algorithms conceptually and presents mathematical derivations and examples, but it does not contain any structured pseudocode or algorithm blocks.
Open Source Code	No	The paper does not contain any explicit statements about releasing source code, nor does it provide links to a code repository for the methodology described.
Open Datasets	No	The paper discusses problem examples such as 'Photo Labeling' and 'Protein Structure' to illustrate concepts, and uses abstract 'data D' in its theoretical examples and mathematical derivations. However, it does not use or provide access information for any publicly available or open datasets for empirical evaluation.
Dataset Splits	No	Since the paper focuses on theoretical derivations and uses abstract data examples (like 'data D' in Section 3.1.2) rather than empirical datasets, it does not provide specific dataset split information.
Hardware Specification	No	The paper is a theoretical study and does not describe any experimental setup that would require specific hardware. Therefore, no hardware specifications are mentioned.
Software Dependencies	No	The paper is theoretical and focuses on mathematical derivations and conceptual discussions rather than implemented experiments. Consequently, it does not list any specific software dependencies with version numbers.
Experiment Setup	No	The paper presents a theoretical study with mathematical derivations and conceptual examples. It does not include an experimental section or details on hyperparameters, training configurations, or system-level settings.