Distribution Estimation under the Infinity Norm
Authors: Aryeh Kontorovich, Amichai Painsky
JMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate our results in synthetic and real-world experiments. Finally, we apply our proposed framework to a basic selective inference problem, where we estimate the most frequent probabilities in a sample. |
| Researcher Affiliation | Academia | Aryeh Kontorovich, Department of Computer Science, Ben Gurion University of the Negev, Beer Sheva, Israel; Amichai Painsky, School of Industrial Engineering, Tel Aviv University, Tel Aviv, Israel |
| Pseudocode | No | The paper describes algorithms and methods but does not present any structured pseudocode or algorithm blocks with specific steps. |
| Open Source Code | No | The paper does not contain any explicit statement about the release of source code for the methodology described, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We begin with census data; we consider the 2000 United States Census (Bureau, 2014), which lists the frequency of the top 1000 most common last names in the United States. |
| Dataset Splits | No | The paper mentions "In each experiment we draw n samples from a distribution over an alphabet size A to evaluate the proposed bounds for a given confidence level 1 − δ. We repeat this process 10^4 times and report the average bound and coverage rate...". This describes data sampling and repetition, but not specific training/validation/test splits for machine learning experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not specify any particular software libraries, frameworks, or their version numbers used in the experiments. |
| Experiment Setup | Yes | We set s = 1.1 throughout our experiments. In the first experiment we focus on A = 100 and δ = 0.05. We examine three bounds. First, we consider the bound from Theorem 2 with δ₁ = 0.99δ, δ₂ = 0.01δ. We set m to minimize (13) over the worst-case distribution (see (10)). This results in m = log(1/δ₁) + 2 ≈ 8. Further, we examine Theorem 6 and the benchmark bound (1). ... Next, we examine the performance of our proposed schemes for a decaying confidence level, δ = 1/n². As above, we set A = 100 and focus on the two benchmark distributions. ... for a typical experiment of n = 10000 |
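The evaluation protocol quoted above (draw n samples over an alphabet of size A, check whether the L∞ error stays within the bound, repeat 10^4 times, report the coverage rate) can be sketched as a short Monte Carlo loop. The sketch below is illustrative only: the `benchmark_bound` function is a hypothetical stand-in built from a per-symbol Hoeffding inequality plus a union bound, not the paper's actual bound (1) or the bounds of Theorems 2 and 6, whose exact forms are not reproduced in this report.

```python
import numpy as np

def benchmark_bound(n, A, delta):
    # Hypothetical stand-in bound: per-symbol Hoeffding inequality,
    # union-bounded over the A alphabet symbols. NOT the paper's bound (1).
    return np.sqrt(np.log(2 * A / delta) / (2 * n))

def coverage_rate(p, n, delta, trials=10_000, seed=0):
    """Fraction of trials in which ||p_hat - p||_inf <= the bound."""
    rng = np.random.default_rng(seed)
    A = len(p)
    eps = benchmark_bound(n, A, delta)
    covered = 0
    for _ in range(trials):
        counts = rng.multinomial(n, p)   # draw n samples from p
        p_hat = counts / n               # empirical distribution
        if np.max(np.abs(p_hat - p)) <= eps:
            covered += 1
    return covered / trials

# Settings mirroring the quoted setup: A = 100, delta = 0.05, n = 10000,
# uniform benchmark distribution (one of many possible choices).
p_uniform = np.ones(100) / 100
rate = coverage_rate(p_uniform, n=10_000, delta=0.05, trials=1_000)
print(f"coverage rate: {rate:.3f}")
```

A valid 1 − δ bound should yield a coverage rate of at least 0.95 here; tighter bounds (such as those the paper proposes) aim to shrink eps while keeping coverage above that threshold.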