Differentially Private n-gram Extraction
Authors: Kunho Kim, Sivakanth Gopi, Janardhan Kulkarni, Sergey Yekhanin
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we empirically evaluate the performance of our algorithms on two datasets: Reddit and MSNBC. |
| Researcher Affiliation | Industry | Kunho Kim Microsoft EMAIL Sivakanth Gopi Microsoft Research EMAIL Janardhan Kulkarni Microsoft Research EMAIL Sergey Yekhanin Microsoft Research EMAIL |
| Pseudocode | Yes | In this section we describe our algorithm for DPNE. The pseudocode is presented in Algorithm 1. |
| Open Source Code | Yes | Code available at https://github.com/microsoft/differentially-private-ngram-extraction |
| Open Datasets | Yes | The Reddit dataset is a natural language dataset used extensively in NLP applications, and is taken from the TensorFlow repository. The MSNBC dataset consists of page visits by users who browsed msnbc.com on September 28, 1999, recorded at the level of URL and ordered by time. |
| Dataset Splits | No | The paper uses Reddit and MSNBC datasets but does not specify how these datasets were split into training, validation, or test sets for their experiments. No explicit percentages, counts, or references to standard splits are provided for reproducibility. |
| Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with their version numbers, such as programming languages, libraries, or frameworks used for implementation or experimentation. |
| Experiment Setup | Yes | Throughout this section we fix T = 9, ε = 4, δ = 10⁻⁷, Δ₁ = ⋯ = Δ₉ = Δ₀ = 300, η = 0.01 unless otherwise specified. |
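The experiment-setup row above lists a privacy budget (ε, δ), per-user contribution caps, and a threshold η, which together suggest a noisy-count-and-threshold release of n-grams. The sketch below illustrates that general pattern only; it is not the paper's Algorithm 1 (which builds n-grams iteratively from previously released (n−1)-grams), and the function name, parameters, and tokenization are assumptions for illustration.

```python
import random
from collections import Counter

def dp_extract_ngrams(user_docs, n, sigma, threshold, max_contrib, rng=None):
    """Illustrative noisy-count-and-threshold sketch for private n-gram release.

    Each user contributes at most `max_contrib` distinct n-grams, Gaussian
    noise of scale `sigma` is added to every count, and only n-grams whose
    noisy count clears `threshold` are released. This is a generic DP
    pattern, not the DPNE algorithm itself.
    """
    rng = rng or random.Random()
    counts = Counter()
    for tokens in user_docs:
        # Distinct n-grams per user, so each user adds at most 1 per n-gram.
        grams = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
        # Cap the per-user contribution (sorted for determinism here).
        for g in sorted(grams)[:max_contrib]:
            counts[g] += 1
    # Release only n-grams whose noisy count exceeds the threshold.
    return {g for g, c in counts.items()
            if c + rng.gauss(0, sigma) > threshold}
```

With noise scale set to zero the function reduces to plain thresholded counting, which makes the mechanics easy to check: two users sharing the bigram ("a", "b") push its count to 2, above a threshold of 1, while all other bigrams are suppressed.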