Neural Estimation of Statistical Divergences

Authors: Sreejith Sreekumar, Ziv Goldfeld

JMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | The paper establishes non-asymptotic absolute-error bounds for a neural estimator realized by a shallow neural network (NN), focusing on four popular f-divergences: Kullback-Leibler, chi-squared, squared Hellinger, and total variation. The analysis relies on non-asymptotic function approximation theorems and tools from empirical process theory to bound the two sources of error involved: function approximation and empirical estimation. The bounds characterize the effective error in terms of the NN size and the number of samples, and reveal scaling rates that ensure consistency. For compactly supported distributions, the paper further shows that neural estimators of the first three divergences above, with an appropriate NN growth rate, are minimax rate-optimal, achieving the parametric convergence rate. (An illustrative estimator sketch follows the table.)
Researcher Affiliation | Academia | Sreejith Sreekumar, Electrical and Computer Engineering Department, Cornell University, Ithaca, NY 14850, USA; Ziv Goldfeld, Electrical and Computer Engineering Department, Cornell University, Ithaca, NY 14850, USA
Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. The methodology described is purely theoretical, involving mathematical formulations and proofs without structured algorithmic steps.
Open Source Code | No | The paper does not provide any explicit statement about the release of open-source code for the methodology described, nor does it include a link to a code repository.
Open Datasets | No | The paper is theoretical and focuses on error bounds and minimax optimality for neural estimators of statistical divergences. It does not conduct experiments on specific datasets and therefore does not provide access information for any open datasets.
Dataset Splits | No | The paper is theoretical and does not conduct empirical experiments requiring dataset splits; therefore, no information on training/validation/test splits is provided.
Hardware Specification | No | The paper presents theoretical analysis and does not report experimental results that would require specific hardware; therefore, no hardware specifications are provided.
Software Dependencies | No | The paper focuses on theoretical contributions and does not describe implementation details or experimental setups that would list specific software dependencies with version numbers.
Experiment Setup | No | This is a theoretical paper without experimental evaluations; consequently, no details are provided regarding experimental setup, hyperparameters, or system-level training settings.
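
For readers who want a concrete sense of the estimator the paper analyzes: it optimizes a variational (dual) form of the divergence over a class of shallow neural networks, with expectations replaced by sample means. The following is a minimal sketch, not the authors' code, of such an estimator for the KL divergence via the Donsker-Varadhan representation D(P||Q) = sup_g E_P[g(X)] - log E_Q[exp(g(Y))], with g restricted to a one-hidden-layer ReLU network. PyTorch is used for convenience; the hidden width, optimizer, learning rate, and step count are illustrative choices, not values from the paper.

    import math
    import torch
    import torch.nn as nn

    class ShallowNet(nn.Module):
        # One-hidden-layer ReLU network g : R^d -> R, i.e. the shallow
        # function class (up to norm constraints) analyzed in the paper.
        def __init__(self, d, width=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(d, width), nn.ReLU(), nn.Linear(width, 1)
            )

        def forward(self, x):
            return self.net(x).squeeze(-1)

    def dv_objective(g, x_p, x_q):
        # Empirical Donsker-Varadhan lower bound on D(P||Q):
        #   mean_i g(x_p[i]) - log( mean_j exp(g(x_q[j])) ).
        log_mean_exp_q = torch.logsumexp(g(x_q), dim=0) - math.log(len(x_q))
        return g(x_p).mean() - log_mean_exp_q

    def neural_kl_estimate(x_p, x_q, width=64, steps=2000, lr=1e-3):
        # Maximize the DV objective over the shallow network's parameters
        # by minimizing its negative with a stochastic-gradient optimizer.
        g = ShallowNet(x_p.shape[1], width)
        opt = torch.optim.Adam(g.parameters(), lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            loss = -dv_objective(g, x_p, x_q)
            loss.backward()
            opt.step()
        with torch.no_grad():
            return dv_objective(g, x_p, x_q).item()

    if __name__ == "__main__":
        # P = N((1,1), I), Q = N(0, I) in d = 2; the true KL divergence
        # is ||mu||^2 / 2 = 1.0, so the printed estimate should be near 1.
        torch.manual_seed(0)
        n, d = 5000, 2
        x_p = torch.randn(n, d) + 1.0
        x_q = torch.randn(n, d)
        print(neural_kl_estimate(x_p, x_q))

The same template extends to the other f-divergences the paper treats by swapping in the corresponding variational objective; the paper's error bounds then govern how the network size and sample count must grow together for such an estimate to be consistent.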