Simultaneous Private Learning of Multiple Concepts

Authors: Mark Bun, Kobbi Nissim, Uri Stemmer

JMLR 2019

Reproducibility Variable | Result | LLM Response
Research Type: Theoretical
We investigate the direct-sum problem in the context of differentially private PAC learning: what is the sample complexity of solving k learning tasks simultaneously under differential privacy, and how does this cost compare to that of solving k learning tasks without privacy? In our setting, an individual example consists of a domain element x labeled by k unknown concepts (c_1, ..., c_k). The goal of a multi-learner is to output k hypotheses (h_1, ..., h_k) that generalize the input examples. Without concern for privacy, the sample complexity needed to simultaneously learn k concepts is essentially the same as that needed to learn a single concept. Under differential privacy, the basic strategy of learning each hypothesis independently yields a sample complexity that grows polynomially with k. For some concept classes, we give multi-learners that require fewer samples than the basic strategy. Unfortunately, however, we also give lower bounds showing that even for very simple concept classes, the sample cost of private multi-learning must grow polynomially in k.
Keywords: differential privacy, PAC learning, agnostic learning, direct-sum
Researcher Affiliation: Academia
Mark Bun, Department of Computer Science, Boston University; Kobbi Nissim, Department of Computer Science, Georgetown University; Uri Stemmer, Department of Computer Science, Ben-Gurion University
Pseudocode: Yes
Algorithm 1 (A_dist). Input: privacy parameters ϵ, δ, database S ∈ X*, sensitivity-1 quality function q...
Algorithm 2 (Query release for POINT_X). Input: privacy parameters (ϵ, δ), database D ∈ X^n...
Algorithm 3 (Generic Learner). Input: concept class C, privacy parameters ϵ', ϵ, δ, and a k-labeled database S = (x_i, y_{i,1}, ..., y_{i,k})_{i=1}^n...
Algorithm 4 (Parity Learner). Input: parameters ϵ, δ, and a k-labeled database S of size n = O(d βδ)...
Algorithm 5 (Point Learner). Input: privacy parameters ϵ, δ, and a k-labeled database S = (x_i, y_{i,1}, ..., y_{i,k})_{i=1}^n.
Open Source Code: No
The paper focuses on theoretical analysis, definitions, theorems, and proofs. There is no mention of a code release, repository links, or supplementary materials containing code for the methodologies described.

Open Datasets: No
The paper is theoretical and reasons about abstract objects such as a "domain X" and a "k-labeled database". It does not perform experiments on concrete datasets, nor does it provide access information for any dataset.

Dataset Splits: No
The paper is theoretical and does not involve empirical experiments on datasets, so there is no discussion or provision of training, validation, or test splits.

Hardware Specification: No
The paper is purely theoretical, focusing on mathematical proofs, algorithms, and sample-complexity analysis. It describes no experiments that would require specific hardware, so no hardware specifications are mentioned.

Software Dependencies: No
The paper is theoretical and includes no experimental section or implementation details, so no software dependencies or version numbers are listed.

Experiment Setup: No
The paper presents algorithms and proofs rather than empirical experiments, so it includes no hyperparameter values, training configurations, or other system-level experimental settings.
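The "basic strategy" the abstract refers to, learning each of the k hypotheses independently, can be sketched concretely. The sketch below is illustrative and not taken from the paper: private_erm and naive_multi_learner are hypothetical helper names, the single-concept learner is a report-noisy-min over a finite hypothesis class, and the multi-learner simply splits the privacy budget ϵ/k per task, which is valid by basic composition.

```python
import random

def private_erm(examples, labels, hypotheses, epsilon):
    """Toy epsilon-DP learner over a finite hypothesis class:
    report-noisy-min on empirical error counts. Each count has
    sensitivity 1, so Laplace noise of scale 2/epsilon suffices."""
    def error(h):
        return sum(1 for x, y in zip(examples, labels) if h(x) != y)
    # Laplace(2/epsilon) noise, drawn as a difference of two Exp(epsilon/2) draws.
    noisy = [error(h)
             + random.expovariate(epsilon / 2)
             - random.expovariate(epsilon / 2)
             for h in hypotheses]
    return hypotheses[noisy.index(min(noisy))]

def naive_multi_learner(examples, label_matrix, hypotheses, epsilon):
    """The 'basic strategy': split the budget evenly and learn the k
    concepts independently. By basic composition the k calls together
    satisfy epsilon-DP, but each task only gets an epsilon/k budget."""
    k = len(label_matrix[0])
    return [private_erm(examples,
                        [labels[j] for labels in label_matrix],
                        hypotheses,
                        epsilon / k)
            for j in range(k)]
```

Because each independent learner runs at budget ϵ/k, its sample requirement scales up accordingly, which is exactly the polynomial-in-k growth the abstract attributes to this strategy.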
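Algorithm 3 ("Generic Learner") is listed above only by its input signature. A standard building block for generic private learners of this kind is the exponential mechanism, which selects a hypothesis with probability decaying exponentially in its empirical error. The sketch below shows that mechanism under the assumption of a finite hypothesis class; it is a hedged illustration, not the paper's actual Algorithm 3, and generic_private_learner is a hypothetical name.

```python
import math
import random

def generic_private_learner(examples, labels, hypotheses, epsilon):
    """Exponential mechanism over a finite hypothesis class: sample h
    with probability proportional to exp(-epsilon * err(h) / 2), where
    err(h) is the empirical error count (sensitivity 1). The selection
    is epsilon-differentially private."""
    errors = [sum(1 for x, y in zip(examples, labels) if h(x) != y)
              for h in hypotheses]
    m = min(errors)  # shift for numerical stability; leaves the distribution unchanged
    weights = [math.exp(-epsilon * (e - m) / 2) for e in errors]
    r = random.uniform(0, sum(weights))
    acc = 0.0
    for h, w in zip(hypotheses, weights):
        acc += w
        if r <= acc:
            return h
    return hypotheses[-1]
```

With high probability the sampled hypothesis has empirical error close to the minimum, which is what makes this selection rule usable as a private agnostic learner for finite classes.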