reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Differentially Private Data Releasing for Smooth Queries

Authors: Ziteng Wang, Chi Jin, Kai Fan, Jiaqi Zhang, Junliang Huang, Yiqiao Zhong, Liwei Wang

JMLR 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We also develop practically efficient variants of the mechanisms with promising experimental results. Finally we conduct experiments on the efficient variant of the mechanism which outputs synthetic database as it may be more useful in practice. Experimental results demonstrate that the algorithm achieves good accuracy and are practically efficient on datasets of various sizes and of a number of attributes. Section 5.2.1 Experimental Results
Researcher Affiliation	Academia	Key Laboratory of Machine Perception (MOE), School of EECS, Peking University, Beijing, 100871, China; Department of Computer Science, University of California, Berkeley, CA 94720-1776, USA; Computational Biology & Bioinformatics, Duke University, Durham, NC 27708, USA; School of Mathematical Sciences, Peking University, Beijing, 100871, China
Pseudocode	Yes	Algorithm 1: Outputting the summary; Algorithm 2: Answering a query; Algorithm 3: Private Synthetic DB for Smooth Queries; Algorithm 4: Private Subspace Iteration (Hardt, 2013)
Open Source Code	Yes	Source codes available at http://www.cis.pku.edu.cn/faculty/vision/wangliwei/software.html
Open Datasets	Yes	We adopt three datasets all from the UCI repository. A summary of the size and the number of attributes of these datasets is given in Table 3.
Dataset Splits	Yes	We randomly partition each dataset into two subsets of equal size. One subset is used as training dataset, the other as test dataset.
Hardware Specification	Yes	The computer used in all the experiments is a workstation with 2 Intel Xeon X5650 processors of 2.67GHz and 32GB RAM.
Software Dependencies	No	The paper mentions learning an SVM classifier and numerical integration methods, but does not provide specific software names with version numbers for reproducibility.
Experiment Setup	Yes	Detailed parameter setting of the query functions is as follows. We consider $\sum_{j=1}^{J} \alpha_j \exp\left(-\|x - x_j\|^2 / (2\sigma^2)\right)$. In all experiments, we set $J = 10$; $\alpha_j$ is randomly chosen from $[0, 1]$, and $x_j$ is randomly chosen from $[-1, 1]^d$. We test various values of $\sigma$ and various $\epsilon$ to see how the smoothness of the query function and privacy parameter affect the performance of the algorithm (see below for detailed results). We test three values of $\epsilon$, and the performances are given in Table 5.
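
The query-function setup and the equal-size random split quoted in the table can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the kernel takes the standard Gaussian form exp(-||x - x_j||^2 / (2σ^2)), since the exact bandwidth normalization is not recoverable from the excerpt, and it assumes a query is answered by averaging f over the data points. The names `make_query`, `split_half`, and `query_answer` are hypothetical, and no privacy mechanism is applied.

```python
import math
import random


def make_query(d, sigma, J=10, rng=None):
    """Build f(x) = sum_j alpha_j * exp(-||x - x_j||^2 / (2 * sigma^2)).

    The 2 * sigma^2 denominator is an assumption (standard Gaussian kernel).
    """
    if rng is None:
        rng = random.Random()
    # alpha_j drawn from [0, 1], centers x_j drawn from [-1, 1]^d.
    alphas = [rng.uniform(0.0, 1.0) for _ in range(J)]
    centers = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(J)]

    def f(x):
        total = 0.0
        for alpha, center in zip(alphas, centers):
            sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
            total += alpha * math.exp(-sq_dist / (2.0 * sigma ** 2))
        return total

    return f


def split_half(rows, rng=None):
    """Randomly partition rows into two subsets of equal size (train, test)."""
    if rng is None:
        rng = random.Random()
    shuffled = list(rows)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def query_answer(f, rows):
    """Non-private answer, assumed here to be the average of f over the rows."""
    return sum(f(x) for x in rows) / len(rows)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data in [-1, 1]^3 standing in for a normalized dataset.
    data = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(100)]
    train, test = split_half(data, rng)
    f = make_query(d=3, sigma=1.0, rng=rng)
    print(query_answer(f, train), query_answer(f, test))
```

Smaller values of σ make each kernel narrower and the query less smooth, which is the dependence the quoted setup says the experiments vary.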