reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

The Empirical Mean is Minimax Optimal for Local Glivenko-Cantelli

Authors: Doron Cohen, Aryeh Kontorovich, Roi Weiss

ICML 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To support our theoretical findings, we present two sets of simulations. The first demonstrates the tightness of the lower bound in Theorem 2.2, while the second highlights a specific setting where the simple average estimator outperforms the Empirical Mean Estimator (EME), complementing the results of Theorem 2.3. ... Figure 2 shows the results. The empirical deviations (dashed lines) closely follow the theoretical bounds (solid lines), confirming the tightness of the lower bound in Theorem 2.2. As expected, larger values of J lead to smoother empirical curves, emphasizing the role of averaging in reducing variance. Notably, the empirical deviations converge to the theoretical decay rate as n grows. ... Figure 3. Error comparison between the EME and the simple average estimator for varying sample sizes n under different distributions: uniform, triangular, Beta(2,2), exponential, 1/n, and Gaussian.
Researcher Affiliation	Academia	1 Department of Computer Science, Ben-Gurion University of the Negev (BGU), Israel; 2 Department of Computer Science, Ariel University, Israel. Correspondence to: Doron Cohen <EMAIL>.
Pseudocode	No	The paper describes methods and proofs using mathematical notation and textual explanations, but it does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code	No	The paper does not provide any statement regarding the availability of source code, nor does it include links to a repository or mention code in supplementary materials.
Open Datasets	No	The simulations in Section A use standard mathematical distributions (uniform, triangular, Beta(2,2), exponential, 1/n, and Gaussian) for generating data, but the paper does not specify any publicly available or open datasets that require concrete access information.
Dataset Splits	No	The paper's simulation section describes generating data from mathematical distributions (e.g., uniform, Gaussian) for empirical evaluation, but it does not utilize or define specific training, validation, or test splits for a named dataset.
Hardware Specification	No	The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used to run the simulations or other computations.
Software Dependencies	No	The paper does not specify any software dependencies with version numbers used for the implementation or simulations.
Experiment Setup	Yes	We consider six values of q: q = 0.1, q = 0.2, q = 0.05, q = 0.01, q = 0.005, and q = 0.002. For each configuration, empirical results are averaged over J = 100, 1000, and 10000 repetitions to ensure stability. ... We evaluate the performance of the EME and the simple average estimator under six different distributions: uniform, triangular, Beta(2,2), exponential, 1/n-scaled, and Gaussian. For each distribution, we vary the number of trials k ∈ {10, 50, 100, 500} and compute the error as a function of the sample size n.
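
Since the paper releases no code, the following is only a rough sketch of how the first simulation in the Experiment Setup row might be approximated. It assumes the deviation of interest is the largest coordinate-wise gap between the empirical mean and the true mean, taken over d independent Bernoulli(q) coordinates and averaged over J repetitions. The dimension d, the use of identical coordinates, and all function names are illustrative assumptions, not details taken from the paper.

```python
import random


def sup_deviation(q, n, d, rng):
    """One draw of max_j |p_hat_j - q| over d independent Bernoulli(q) coordinates.

    Assumption: every coordinate has the same mean q; the paper's exact
    construction may differ.
    """
    worst = 0.0
    for _ in range(d):
        # Empirical mean of n Bernoulli(q) samples for this coordinate.
        successes = sum(1 for _ in range(n) if rng.random() < q)
        worst = max(worst, abs(successes / n - q))
    return worst


def average_deviation(q, n, d, repetitions, seed=0):
    """Average the sup-deviation over `repetitions` independent runs (the J in the table)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(repetitions):
        total += sup_deviation(q, n, d, rng)
    return total / repetitions


if __name__ == "__main__":
    # q values as listed in the Experiment Setup row; d = 5 and the n grid are hypothetical.
    for q in (0.1, 0.2, 0.05, 0.01, 0.005, 0.002):
        for n in (10, 100, 1000):
            dev = average_deviation(q, n, d=5, repetitions=100)
            print(f"q={q} n={n} mean sup-deviation={dev:.4f}")
```

Fixing the seed makes each configuration repeatable, which partly compensates for the missing hardware and software details noted in the table. Raising `repetitions` toward the larger J values should smooth the curves, at a proportional cost in run time.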