reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Bounded Space Differentially Private Quantiles

Authors: Daniel Alabi, Omri Ben-Eliezer, Anamay Chaturvedi

TMLR 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We implement our algorithms and experimentally evaluate them on synthetic and real-world datasets.
Researcher Affiliation	Academia	Daniel Alabi EMAIL Columbia University Omri Ben-Eliezer EMAIL Massachusetts Institute of Technology Anamay Chaturvedi EMAIL Northeastern University
Pseudocode	Yes	Algorithm 1: DPExpGK: Exponential Mechanism DP Quantiles: High Level Description. Data: X = (x1, x2, ..., xn). Input: ϵ, α (approximation parameter), q ∈ [0, 1] (quantile parameters), δu (sensitivity)
Open Source Code	No	The paper does not provide concrete access to source code for the methodology described. It states "We implement our algorithms and experimentally evaluate them on synthetic and real-world datasets." but no link or explicit statement of code release is found.
Open Datasets	Yes	We repeat our investigation of the utility and space complexity comparison between DPExpGKGumb and DPExpFull with the following real-world data sets (Dua & Graff, 2017): (1) Taxi Service Trajectory: A dataset from the UCI machine learning repository describing trajectories performed by all 442 taxis (at the time) in the city of Porto in Portugal (Moreira-Matias et al., 2013). (2) Gas Sensor Dataset: A UCI repository dataset containing recordings of 16 chemical sensors exposed to varying concentrations of two gas mixtures (Fonollosa et al., 2015).
Dataset Splits	No	The paper does not provide specific details on training, validation, or test dataset splits. It mentions evaluating on synthetic and real-world datasets and conducting '100 trials' but does not specify how the data was partitioned for these evaluations.
Hardware Specification	No	The paper does not provide specific hardware details used for running its experiments. It mentions implementing and evaluating algorithms but does not specify the computing environment or components.
Software Dependencies	No	The paper does not provide specific software dependencies or their version numbers. It mentions implementing algorithms but does not list any programming languages, libraries, or frameworks with version details.
Experiment Setup	No	The paper mentions varying parameters such as 'stream length n', 'Approximation Parameter α', and 'Data Distribution' in Section 6, but it does not specify concrete hyperparameter values or detailed training configurations (e.g., learning rates, batch sizes, optimizers) for its experiments.