reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Density Estimation Using the Perceptron

Authors: Patrik Róbert Gerber, Tianze Jiang, Yury Polyanskiy, Rui Sun

JMLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We propose a new density estimation algorithm. Given n i.i.d. observations from a distribution belonging to a class of densities on R^d, our estimator outputs any density in the class whose perceptron discrepancy with the empirical distribution is at most O(√(d/n)). The perceptron discrepancy is defined as the largest difference in mass two distributions place on any halfspace. It is shown that this estimator achieves the expected total variation distance to the truth that is almost minimax optimal over the class of densities with bounded Sobolev norm and Gaussian mixtures.
Researcher Affiliation	Academia	Patrik Róbert Gerber EMAIL Department of Mathematics Massachusetts Institute of Technology 77 Massachusetts Avenue, Cambridge, MA 02139, USA Tianze Jiang EMAIL Operations Research and Financial Engineering Princeton University 98 Charlton St, Princeton, NJ, 08540, USA Yury Polyanskiy EMAIL Department of Electrical Engineering and Computer Science Massachusetts Institute of Technology 32 Vassar St, Cambridge, MA 02139, USA Rui Sun EMAIL Department of Statistics Stanford University 450 Jane Stanford Way, Stanford, CA 94305, USA
Pseudocode	No	The paper describes algorithms and methods using mathematical notation and textual explanations, but it does not include any explicitly labeled pseudocode or algorithm blocks with structured steps.
Open Source Code	No	The paper does not provide concrete access to source code for the methodology described. It contains a license for the paper itself, but no statement or link regarding code release.
Open Datasets	No	The paper is theoretical and does not utilize specific datasets for empirical evaluation. It defines classes of distributions (P_S(β, d, C) and P_G(d)) for its theoretical analysis but does not mention any publicly available datasets.
Dataset Splits	No	The paper is theoretical and does not involve experiments with datasets; therefore, no dataset splits are discussed or provided.
Hardware Specification	No	The paper is theoretical and focuses on mathematical proofs and algorithmic design. It does not describe any experimental setup or the hardware used to run experiments.
Software Dependencies	No	The paper is theoretical and primarily describes mathematical concepts and algorithms without detailing any specific software implementations or dependencies with version numbers.
Experiment Setup	No	The paper is theoretical in nature, presenting algorithms, theorems, and proofs. It does not describe a practical experimental setup with hyperparameters or training configurations.
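
The perceptron discrepancy quoted in the Research Type row (the largest difference in mass two distributions place on any halfspace) can be illustrated numerically. The sketch below is not taken from the paper, which provides no code or pseudocode. It assumes two finite samples stand in for the two distributions, and it replaces the supremum over all halfspaces with a maximum over randomly drawn directions, so it only yields a lower bound. The function names and the num_directions and seed parameters are hypothetical. For a fixed direction, the largest halfspace gap reduces to the largest gap between the empirical CDFs of the projected samples.

```python
import numpy as np


def ks_statistic_1d(a, b):
    """Largest gap between the empirical CDFs of two 1-D samples."""
    a = np.sort(a)
    b = np.sort(b)
    # The gap between two step functions is attained at a data point.
    pooled = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, pooled, side="right") / a.size
    cdf_b = np.searchsorted(b, pooled, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


def perceptron_discrepancy_lower_bound(x, y, num_directions=256, seed=0):
    """Monte Carlo lower bound on the halfspace discrepancy of two samples.

    x, y: arrays of shape (n, d) and (m, d).
    Assumption: random unit directions approximate the supremum over
    halfspaces; this is an illustration, not the paper's procedure.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    directions = rng.standard_normal((num_directions, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    best = 0.0
    for v in directions:
        # Project both samples onto v; halfspaces with normal v become
        # threshold sets on the real line.
        best = max(best, ks_statistic_1d(x @ v, y @ v))
    return best
```

The returned value lies between 0 and 1. In the estimator described in the Research Type row, a quantity of this kind would be compared against a threshold of order √(d/n), with the empirical distribution on one side and a candidate density from the class on the other; how to carry out that search over the class is not covered by this sketch.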