Active Nearest-Neighbor Learning in Metric Spaces
Authors: Aryeh Kontorovich, Sivan Sabato, Ruth Urner
JMLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We propose a pool-based non-parametric active learning algorithm for general metric spaces, called MArgin Regularized Metric Active Nearest Neighbor (MARMANN), which outputs a nearest-neighbor classifier. We give prediction error guarantees that depend on the noisy-margin properties of the input sample, and are competitive with those obtained by previously proposed passive learners. We prove that the label complexity of MARMANN is significantly lower than that of any passive learner with similar error guarantees. MARMANN is based on a generalized sample compression scheme, and a new label-efficient active model-selection procedure. |
| Researcher Affiliation | Academia | Aryeh Kontorovich EMAIL Sivan Sabato EMAIL Department of Computer Science, Ben-Gurion University of the Negev, Beer Sheva 8499000, Israel; Ruth Urner EMAIL Lassonde School of Engineering, EECS Department, York University, Toronto, ON, Canada |
| Pseudocode | Yes | Algorithm 1: MARMANN (MArgin Regularized Metric Active Nearest Neighbor); Algorithm 2: GenerateNNSet(t, I, δ); Algorithm 3: EstBer(θ, β, δ); Algorithm 4: SelectScale(δ) |
| Open Source Code | No | We note that implementing even this passive component efficiently is far from trivial; this formed the core of a Master's thesis (Korsunsky, 2017). The remaining obstacle to making our algorithm fully practical is the magnitude of some of the constants. We believe these to be artifacts of the proof and intend to bring them down to manageable values in future work. |
| Open Datasets | No | The paper focuses on theoretical development and proofs for an algorithm in general metric spaces. It does not mention the use of any specific datasets for experimental evaluation. |
| Dataset Splits | No | The paper does not describe any experimental evaluation using specific datasets, and therefore no information on dataset splits is provided. |
| Hardware Specification | No | The paper is theoretical in nature, focusing on algorithm design and proofs, and does not describe any experimental setup or hardware used. |
| Software Dependencies | No | The paper is theoretical, presenting an algorithm and its guarantees. It does not describe an experimental implementation with software dependencies. |
| Experiment Setup | No | The paper details a theoretical algorithm and its mathematical guarantees. It does not include an experimental section with specific setup details like hyperparameters or training configurations. |
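Since the paper provides no implementation, the flavor of a pool-based active learner that outputs a nearest-neighbor classifier can be illustrated with a toy sketch. This is not MARMANN: the names (`active_1nn`, `oracle`, `budget`) and the farthest-first query heuristic are illustrative assumptions, far simpler than the paper's compression-based scale selection and label-efficient model selection.

```python
import math

def dist(a, b):
    # Euclidean distance; in principle any metric on the space works.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def active_1nn(pool, oracle, budget):
    """Toy pool-based active learner returning a 1-NN classifier.

    Greedily queries the pool point farthest from the labeled set
    (a farthest-first heuristic), then predicts with the nearest
    labeled neighbor. Purely illustrative; MARMANN's guarantees
    rest on a very different margin-regularized procedure.
    """
    labeled = {}  # point (tuple) -> queried label
    labeled[pool[0]] = oracle(pool[0])  # seed with an arbitrary point
    while len(labeled) < min(budget, len(pool)):
        candidates = [p for p in pool if p not in labeled]
        # Query the unlabeled point farthest from every labeled point.
        nxt = max(candidates, key=lambda p: min(dist(p, q) for q in labeled))
        labeled[nxt] = oracle(nxt)

    def predict(x):
        nearest = min(labeled, key=lambda q: dist(x, q))
        return labeled[nearest]

    return predict

# Usage: points on a line, labeled by sign; only 4 labels are queried.
pool = [(float(i),) for i in range(-5, 6)]
oracle = lambda p: 1 if p[0] >= 0 else -1
clf = active_1nn(pool, oracle, budget=4)
```

The sketch queries far fewer labels than the pool size, which is the qualitative point of the paper's label-complexity result, though without any of its noisy-margin error guarantees.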