PAC-learning for Strategic Classification

Authors: Ravi Sundaram, Anil Vullikanti, Haifeng Xu, Fan Yao

JMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We prove that any strategic classification problem is agnostic PAC-learnable by the empirical risk minimization paradigm with O((1/ε²)[d + log(1/δ)]) samples, where d = SVC(H, R, c). Conceptually, this result illustrates that SVC correctly characterizes the learnability of the hypothesis class H in our strategic setup. Our SVC notion generalizes the adversarial VC-dimension (AVC) introduced in (Cullina et al., 2018) for adversarial learning with evasion attacks. Formally, we prove that AVC equals precisely SVC(H, R, c) for R = {−1, 1} when data points are allowed to move within the region {z : c(z; x) ≤ 1} in the adversarial learning setup.
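The sample-complexity claim above can be made concrete with a small calculator. This is an illustrative sketch, not the paper's code: the theorem only guarantees the asymptotic rate O((1/ε²)[d + log(1/δ)]), so the constant `C` and the function name below are placeholders of our own.

```python
import math

def serm_sample_bound(d, eps, delta, C=1.0):
    """Illustrative sample size m = ceil(C/eps^2 * (d + log(1/delta))).

    d stands for the strategic VC-dimension SVC(H, R, c); C is a
    hypothetical constant, since the theorem only states the O(.) rate.
    """
    return math.ceil(C / eps ** 2 * (d + math.log(1.0 / delta)))

# For example, with d = 10, eps = 0.1, delta = 0.05 (and C = 1):
print(serm_sample_bound(10, 0.1, 0.05))  # 1300
```

As expected for a bound of this shape, halving ε roughly quadruples the required sample size, while tightening δ only adds a logarithmic term.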
Researcher Affiliation | Academia | Ravi Sundaram EMAIL Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA; Anil Vullikanti EMAIL Department of Computer Science, Biocomplexity Institute, University of Virginia, Charlottesville, VA 22904, USA; Haifeng Xu EMAIL Department of Computer Science, University of Chicago, Chicago, IL 60637, USA; Fan Yao EMAIL Department of Computer Science, University of Virginia, Charlottesville, VA 22904, USA
Pseudocode | No | The paper includes mathematical definitions, theorems, and proofs, but does not contain any clearly labeled pseudocode or algorithm blocks. For example, Strategic Empirical Risk Minimization (SERM) is given as a mathematical formulation, not as an algorithm.
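To illustrate the gap this row points at, here is a minimal sketch of how SERM's mathematical formulation could be turned into an algorithm. Everything in it is an assumption of ours, not the paper's: a finite hypothesis class, a discrete candidate set of feature vectors, agents who prefer the +1 label and manipulate only when the cost stays within their reward r, and the helper names `best_response`, `strategic_empirical_risk`, and `serm`.

```python
def best_response(h, x, r, candidates, cost):
    # Agent already classified +1 has no incentive to move.
    if h(x) == 1:
        return x
    # Otherwise, move to the cheapest candidate z that h classifies +1,
    # provided the manipulation cost does not exceed the reward r.
    reachable = [z for z in candidates if h(z) == 1 and cost(z, x) <= r]
    if not reachable:
        return x
    return min(reachable, key=lambda z: cost(z, x))

def strategic_empirical_risk(h, sample, candidates, cost):
    # Fraction of points misclassified *after* each agent best-responds to h.
    errors = 0
    for (x, y, r) in sample:
        z = best_response(h, x, r, candidates, cost)
        pred = 1 if h(z) == 1 else -1
        if pred != y:
            errors += 1
    return errors / len(sample)

def serm(hypotheses, sample, candidates, cost):
    # SERM: pick the hypothesis minimizing strategic empirical risk.
    return min(hypotheses,
               key=lambda h: strategic_empirical_risk(h, sample, candidates, cost))
```

For instance, with 1-D threshold classifiers and cost(z, x) = |z − x|, `serm` selects the threshold whose post-manipulation error on the sample is smallest. The paper leaves the key computational step, namely computing each agent's best response against general (in particular, randomized) classifiers, as an open question.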
Open Source Code | No | The paper contains no explicit statement about releasing code and provides no links to source code repositories. It concludes by stating "Another interesting follow-up question is how we can efficiently compute the optimal randomized linear classifier for strategic classification. It is challenging because it is unclear how to compute the best response of a data point against such a randomized classifier. We identify it as an intriguing future direction.", which is consistent with no implementation accompanying the paper.
Open Datasets | No | The paper focuses on theoretical aspects of PAC-learning for strategic classification. It discusses conceptual data points as a 'tuple (x, y, r), where x ∈ X and y ∈ {−1, +1} are the feature and label, respectively...'. There is no mention of specific datasets (e.g., CIFAR-10, ImageNet), links, DOIs, or citations to publicly available datasets used for empirical evaluation.
Dataset Splits | No | The paper is purely theoretical and does not conduct experiments on specific datasets. While it mentions 'training data' and 'testing data' in a conceptual sense within the theoretical framework, it does not specify any actual dataset splits (e.g., percentages, sample counts, or predefined splits) that would be needed to reproduce experiments.
Hardware Specification | No | The paper is a theoretical work focusing on statistical and computational learnability. It does not describe any experiments that would require specific hardware, and no hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and does not mention any specific software, libraries, or frameworks (with or without version numbers) that would be required to reproduce its results.
Experiment Setup | No | The paper is entirely theoretical, presenting definitions, theorems, and proofs. It includes no empirical experiments, and therefore no details about experimental setup, hyperparameters, or training configurations.