reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Privately Counting Partially Ordered Data

Authors: Matthew Joseph, Mónica Ribero, Alexander Yu

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	This section evaluates our K-norm mechanism on a variety of poset structures. First, we derive a general result about the squared expected ℓ2 norm of ℓp balls (Section 4.1). Based on this result, the strongest comparison for our algorithm is the ℓ∞ mechanism. We use this as the baseline for experiments on path posets, random posets (Section 4.3) and the National Health Interview Survey (Services & Medicaid, 2024) (Section 4.4). An evaluation of runtime appears in Section 4.5.
Researcher Affiliation	Industry	Matthew Joseph, Mónica Ribero & Alexander Yu, Google Research NY, EMAIL.
Pseudocode	Yes	Pseudocode appears in Algorithm 1. Algorithm 1 Poset Ball Sampler 1: Input: Poset P satisfying Assumption 2.8 2: Uniformly sample an extended bipartition (N+, N−, A, B) (Lemma 3.14) 3: Convert (N+, N−, A, B) into its non-interfering chain C = (C+, C−) (Lemma 3.13) 4: Compute the vertices of the simplex F(C) = F(C+) F(C−) (Lemma 3.11) 5: Return a uniform sample from F(C) (Lemma 3.6)
Open Source Code	No	The paper does not contain an explicit statement about releasing source code for the methodology, nor does it provide a link to a code repository.
Open Datasets	Yes	The final set of experiments uses the National Health Interview Survey (NHIS) (Services & Medicaid, 2024). As described in the introduction, the survey includes or omits certain questions depending on previous answers. This induces a poset, and our experiments use either the first, first two, or first three sections of the survey (Hypertension, Cholesterol, and Asthma). The resulting posets have size from d = 4 to d = 15. As shown in Figure 6, our mechanism roughly halves the error of the baseline mechanism.
Dataset Splits	No	The paper mentions running experiments for a certain number of trials (e.g., '100 trials', '10,000 trials') and generating random posets, but does not provide specific train/test/validation splits for a fixed dataset, nor does it refer to standard predefined splits for machine learning tasks.
Hardware Specification	No	On a 2 CPU machine with 32GB RAM, our method takes less than half a second for any of the d used in our experiments. While the memory amount (32GB RAM) and the number of CPUs (2 CPU) are mentioned, specific CPU model or processor types are not provided.
Software Dependencies	No	The paper does not explicitly mention any specific software dependencies, libraries, or solvers with version numbers.
Experiment Setup	No	The paper discusses the theoretical framework and experimental results, including comparisons of noise mechanisms, but does not specify concrete experimental setup details such as hyperparameter values (e.g., learning rates, batch sizes, number of epochs) or specific optimizer settings typically found in machine learning experiments. It mentions 'fixing privacy parameters' but does not state their values.
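
Since no source code is linked, the last step of Algorithm 1 (returning a uniform sample from a simplex whose vertices are already known) can be illustrated with a short sketch. This is not the authors' implementation and does not reproduce Lemma 3.6. It uses the standard technique of drawing barycentric weights from a Dirichlet(1, ..., 1) distribution, and it assumes the vertices are affinely independent. The function name `sample_uniform_from_simplex` and the example triangle are hypothetical, chosen only for illustration.

```python
import numpy as np


def sample_uniform_from_simplex(vertices, rng):
    """Draw one point uniformly from the simplex spanned by `vertices`.

    vertices: array of shape (k, d), assumed affinely independent
              (an assumption of this sketch, not a claim about the paper).
    rng:      a numpy.random.Generator.
    """
    vertices = np.asarray(vertices, dtype=float)
    # Dirichlet(1, ..., 1) is uniform over barycentric coordinates, and the
    # affine map from those coordinates to the simplex preserves uniformity.
    weights = rng.dirichlet(np.ones(len(vertices)))
    return weights @ vertices


rng = np.random.default_rng(0)
# Hypothetical example: the triangle with corners (0,0), (1,0), (0,1).
triangle = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
point = sample_uniform_from_simplex(triangle, rng)
```

Steps 2 through 4 of Algorithm 1 depend on the constructions in Lemmas 3.11, 3.13, and 3.14, which this page does not describe, so they are left out of the sketch.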