Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Revisiting Model-Agnostic Private Learning: Faster Rates and Active Learning
Authors: Chong Liu, Yuqing Zhu, Kamalika Chaudhuri, Yu-Xiang Wang
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, our experiments on real-life datasets demonstrate that PATE-ASQ achieves significantly better accuracy than standard PATE algorithms while incurring the same or lower privacy loss. |
| Researcher Affiliation | Academia | Chong Liu EMAIL Yuqing Zhu EMAIL Kamalika Chaudhuri EMAIL Yu-Xiang Wang EMAIL Department of Computer Science, University of California, Santa Barbara, CA 93106, USA Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093, USA |
| Pseudocode | Yes | Algorithm 1 Standard PATE (Papernot et al., 2018), Algorithm 2 SVT-based PATE (Bassily et al., 2018b), Algorithm 3 PATE-PSQ, Algorithm 4 Disagreement-Based Active Learning (Hanneke, 2014), Algorithm 5 PATE-ASQ |
| Open Source Code | No | The authors would like to thank Songbai Yan for sharing their code with a practical implementation of the disagreement-based active learning algorithms from Yan et al. (2018). (This refers to code from Yan et al. (2018) not the authors' own implementation for this paper. No other statement regarding code release is found.) |
| Open Datasets | Yes | Datasets. We do our experiments on three binary classification datasets, mushroom, a9a, and real-sim. All of them are obtained from the LIBSVM dataset website: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/ |
| Dataset Splits | Yes | For all datasets, 80% of all data points are randomly selected to be considered private and used to train teacher classifiers. 2% of all data points are randomly selected as public student unlabeled data points. The remaining 18% of data points are reserved for testing. We repeat this random selection process 30 times. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory, or cloud instance types) are provided in the paper. |
| Software Dependencies | No | All privacy accounting and calibration are conducted via Auto DP (Wang et al., 2019), and the tight analytical calibration and composition of Gaussian mechanisms are due to (Balle and Wang, 2018). (Specific version numbers for Auto DP or other software components are not provided.) |
| Experiment Setup | Yes | Number of teachers K is set on all datasets so that each teacher classifier gets trained with approximately 100 data points. 30% of student unlabeled data points are set as the total budget of queries for PATE-ASQ and PATE-ASQ-NP. See Table 3. ϵ = 0.5, 1.0, 2.0 and δ = 1/n are set as privacy parameters for all datasets, where n is number of private teacher data points. |
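The split protocol reported above (80% private teacher data, 2% public student data, 18% test, repeated 30 times, with the teacher count K chosen so each teacher trains on roughly 100 points) can be sketched as follows. This is an illustrative reconstruction, not the authors' released code; the function name `split_dataset` and the use of numpy are our own assumptions.

```python
import numpy as np

def split_dataset(n, seed):
    """Partition n data-point indices into the 80/2/18 split described
    in the paper: private teacher data / public unlabeled student data /
    test data. Illustrative sketch only (no official code was released)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_teacher = int(0.80 * n)   # 80%: private data for training teachers
    n_student = int(0.02 * n)   # 2%: public unlabeled student queries
    teacher = idx[:n_teacher]
    student = idx[n_teacher:n_teacher + n_student]
    test = idx[n_teacher + n_student:]          # remaining ~18%: test set
    K = max(1, len(teacher) // 100)             # ~100 points per teacher
    return teacher, student, test, K

# The paper repeats the random split 30 times; one run per seed.
splits = [split_dataset(10000, seed) for seed in range(30)]
```

Note that `delta = 1 / n_teacher` would then give the paper's privacy parameter δ = 1/n, with n the number of private teacher data points.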