Classification with Valid and Adaptive Coverage
Authors: Yaniv Romano, Matteo Sesia, Emmanuel J. Candès
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on synthetic and real data demonstrate the practical value of our theoretical guarantees, as well as the statistical advantages of the proposed methods over the existing alternatives. |
| Researcher Affiliation | Academia | Yaniv Romano, Department of Statistics, Stanford University, Stanford, CA, USA, EMAIL; Matteo Sesia, Department of Data Sciences and Operations, University of Southern California, Los Angeles, CA, USA, EMAIL; Emmanuel J. Candès, Departments of Mathematics and of Statistics, Stanford University, Stanford, CA, USA, EMAIL |
| Pseudocode | Yes | Algorithm 1: Adaptive classification with split-conformal calibration |
| Open Source Code | Yes | The Python package at https://github.com/msesia/arc implements our methods. This repository also contains code to reproduce our experiments. |
| Open Datasets | Yes | The methods are tested on two well-known data sets: the Mice Protein Expression data set3 and the MNIST handwritten digit data set. (Footnote 3: https://archive.ics.uci.edu/ml/datasets/Mice+Protein+Expression) |
| Dataset Splits | Yes | Algorithm 1: Input: data {(X_i, Y_i)}_{i=1}^n ... Randomly split the training data into 2 subsets, I_1, I_2. ... Algorithm 2: Input: data {(X_i, Y_i)}_{i=1}^n ... Randomly split the training data into K disjoint subsets, I_1, ..., I_K, each of size n/K. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | We compare the performances of Algorithms 1 (SC) and 2 (CV+, JK+)... We explore 3 different black-boxes: an oracle... a support vector classifier (SVC) implemented by the sklearn Python package; and a random forest classifier (RFC) also implemented by sklearn... |
| Experiment Setup | Yes | We fix α = 0.1 and assess performance in terms of marginal coverage, conditional coverage, and mean cardinality of the prediction sets. |
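The split-conformal recipe summarized in the table (split the data into a fitting subset I_1 and a calibration subset I_2, train a black-box such as an sklearn random forest on I_1, calibrate a threshold on I_2 at level α = 0.1, then report marginal coverage and mean set cardinality) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses a plain softmax-based conformity score on synthetic data rather than the paper's adaptive generalized-inverse-quantile score, and all variable names and data settings here are illustrative assumptions.

```python
# Hedged sketch of split-conformal classification in the spirit of
# Algorithm 1 (SC). The score used here (one minus the estimated
# probability of the true label) is a common simplification, NOT the
# paper's adaptive score; sklearn classes are as named in the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

alpha = 0.1  # target miscoverage level, as in the experiments

# Illustrative synthetic data (the paper uses Mice Protein and MNIST).
X, y = make_classification(n_samples=4000, n_classes=3,
                           n_informative=6, random_state=0)
# Hold out a test set, then split the rest into I1 (fit) and I2 (calibrate).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1000, random_state=0)
X1, X2, y1, y2 = train_test_split(
    X_train, y_train, test_size=0.5, random_state=0)

# Black-box classifier fitted on I1 only.
clf = RandomForestClassifier(random_state=0).fit(X1, y1)

# Conformity scores on the calibration split I2.
cal_scores = 1.0 - clf.predict_proba(X2)[np.arange(len(y2)), y2]
# Finite-sample-corrected (1 - alpha) quantile of the calibration scores.
n2 = len(y2)
qhat = np.quantile(cal_scores,
                   np.ceil((n2 + 1) * (1 - alpha)) / n2,
                   method="higher")

# Prediction set for each test point: all labels scoring below the threshold.
test_sets = (1.0 - clf.predict_proba(X_test)) <= qhat

# The two metrics reported in the experiments.
coverage = test_sets[np.arange(len(y_test)), y_test].mean()
cardinality = test_sets.sum(axis=1).mean()
print(f"marginal coverage ~ {coverage:.3f}, mean set size ~ {cardinality:.2f}")
```

By the split-conformal exchangeability argument, the empirical marginal coverage should land near or above 1 - α = 0.9; the paper's contribution is that its adaptive score additionally targets good conditional coverage, which this simplified score does not attempt.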