Bayesian Entropy Estimation for Countable Discrete Distributions
Authors: Evan Archer, Il Memming Park, Jonathan W. Pillow
JMLR 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we explore the theoretical properties of the resulting estimator, and show that it performs well both in simulation and in application to real data. |
| Researcher Affiliation | Academia | Evan Archer EMAIL (Center for Perceptual Systems, The University of Texas at Austin, Austin, TX 78712, USA; Max Planck Institute for Biological Cybernetics, Spemannstrasse 41, 72076 Tübingen, Germany); Il Memming Park EMAIL (Center for Perceptual Systems, The University of Texas at Austin, Austin, TX 78712, USA); Jonathan W. Pillow EMAIL (Department of Psychology, Section of Neurobiology, Division of Statistics and Scientific Computation, and Center for Perceptual Systems, The University of Texas at Austin, Austin, TX 78712, USA) |
| Pseudocode | No | The paper contains detailed mathematical derivations and descriptions of methods like stick-breaking, but does not present any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | A MATLAB implementation of the PYM estimator is available at https://github.com/pillowlab/PYMentropy. |
| Open Datasets | Yes | Figure 2: Empirical cumulative distribution functions of words in natural language (left) and neural spike patterns (right). ... (left) Frequency of N = 217826 words in the novel Moby Dick by Herman Melville. ... (right) Frequencies among N = 1.2 × 10^6 neural spike words from 27 simultaneously-recorded retinal ganglion cells... (Pillow et al., 2005). We tokenized the novel into individual words using the Python library NLTK. ... We thank E. J. Chichilnisky, A. M. Litke, A. Sher and J. Shlens for retinal data... |
| Dataset Splits | No | In each simulation, we draw 10 sample distributions π. From each π we draw a data set of N iid samples. ... For Moby Dick, PYM slightly overestimates, while DPM slightly underestimates... The neural data were preprocessed to be a binarized response... The paper focuses on applying estimators to samples rather than defining train/test splits for model training. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments or simulations. |
| Software Dependencies | No | We tokenized the novel into individual words using the Python library NLTK. ... A MATLAB implementation of the PYM estimator is available... The paper mentions software tools (NLTK, MATLAB) but does not provide specific version numbers for them. |
| Experiment Setup | No | The paper describes the theoretical framework of the PYM estimator and its application to data, including how samples are drawn for simulations. However, it does not provide specific hyperparameter values, training configurations, or system-level settings for model training as might be found in a typical experimental setup. |
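The paper's experiments apply entropy estimators to empirical symbol counts (word frequencies, spike words). As a minimal illustration of the baseline the PYM estimator improves upon (this is the naive plug-in estimator, not the paper's method), the entropy of a sample can be computed directly from its empirical frequencies:

```python
import math
from collections import Counter

def plugin_entropy(samples):
    """Plug-in (maximum-likelihood) entropy estimate in bits.

    This naive estimator is biased downward when the distribution is
    undersampled, which motivates Bayesian estimators such as PYM.
    """
    counts = Counter(samples)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A uniform sample over 4 distinct symbols has plug-in entropy 2 bits.
print(plugin_entropy(["a", "b", "c", "d"]))  # → 2.0
```

The same counting step mirrors the paper's preprocessing, where the novel is tokenized into words and the neural recordings are binarized into spike words before estimation.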