Estimating Density Models with Truncation Boundaries using Score Matching

Authors: Song Liu, Takafumi Kanamori, Daniel J. Williams

JMLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the usefulness of our method via numerical experiments and a study on the Chicago crime data set. We also show that the proposed density estimation can correct the outlier-trimming bias caused by aggressive outlier detection methods. Section 8 is titled "Numerical and Real-world Data Analysis" and includes various experiments with datasets and performance comparisons.
Researcher Affiliation | Academia | Song Liu EMAIL University of Bristol; Takafumi Kanamori EMAIL Tokyo Institute of Technology, RIKEN AIP; Daniel J. Williams EMAIL University of Bristol. All listed affiliations are academic institutions.
Pseudocode | No | The paper describes its methodology using mathematical formulations and descriptive text, but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The code and data sets to reproduce our experiments are available at https://github.com/anewgithubname/Truncated-Score-Matching.
Open Datasets | Yes | We demonstrate the usefulness of our method via numerical experiments and a study on the Chicago crime data set. The code and data sets to reproduce our experiments are available at https://github.com/anewgithubname/Truncated-Score-Matching. We also experiment on a real-world data set, CIFAR-10, which contains ten different classes of 32 by 32 images.
Dataset Splits | No | The paper mentions generating samples (e.g., "We generate 10,000 samples, only 1417 of which can be used for parameter estimation"), and for CIFAR-10, it states using a "hold-out likelihood". However, specific percentages, absolute counts, or explicit methodologies for splitting data into training, validation, and test sets for reproducibility are not provided in the main text.
Hardware Specification | Yes | Our experiments are run on a workstation with an AMD Ryzen 1700 CPU with 32GB memory.
Software Dependencies | No | We optimize both objective functions using MATLAB's fminunc function with default settings. While MATLAB is mentioned, a specific version number for MATLAB itself is not provided.
Experiment Setup | Yes | Our unnormalized density model is a Gaussian mixture model with four components (parametrized by $\theta_1, \dots, \theta_4$) and the unit variance-covariance matrix: $p_{\theta_1,\dots,\theta_4}(x) = \sum_{i=1}^{4} N_x(\theta_i, I)$. In this experiment, 500,000 particles are used to approximate $Z_V(\theta)$. We fit a Gaussian mixture model with two components on this data set. The standard deviations of the two components are fixed to the same value, roughly half of the width of the city. The outlier percentage (ν) in OSVM is set to 20%.
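
To make the Experiment Setup entry concrete, the following is a minimal Python sketch of the four-component unnormalized Gaussian mixture and a particle-based approximation of the normalizing constant Z_V(θ) over a truncation domain V. It is not the authors' MATLAB code. The uniform-sampling scheme, the bounding box, the disc-shaped domain, and the names unnormalized_gmm, estimate_z, and in_disc are all assumptions introduced here for illustration; the paper may approximate Z_V(θ) differently.

```python
import numpy as np

def unnormalized_gmm(x, thetas):
    """Sum of N(x; theta_i, I) over the rows of `thetas`; x has shape (n, d)."""
    x = np.atleast_2d(x)
    d = x.shape[1]
    # Squared distance between every point and every component mean: (n, k).
    sq_dist = ((x[:, None, :] - thetas[None, :, :]) ** 2).sum(axis=2)
    return (np.exp(-0.5 * sq_dist) / (2 * np.pi) ** (d / 2)).sum(axis=1)

def estimate_z(thetas, in_domain, low, high, n_particles=500_000, seed=0):
    """Monte Carlo estimate of the integral of the unnormalized density over V.

    Assumption: V lies inside the box [low, high]^d and particles are drawn
    uniformly from that box; the paper's own sampling scheme may differ.
    """
    rng = np.random.default_rng(seed)
    d = thetas.shape[1]
    particles = rng.uniform(low, high, size=(n_particles, d))
    box_volume = (high - low) ** d
    # Zero out particles that fall outside the truncation domain V.
    values = unnormalized_gmm(particles, thetas) * in_domain(particles)
    return box_volume * values.mean()

# Hypothetical example: four means in 2-D, V = disc of radius 2 at the origin.
thetas = np.array([[1.0, 1.0], [-1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]])
in_disc = lambda p: (p ** 2).sum(axis=1) <= 4.0
z_hat = estimate_z(thetas, in_disc, low=-2.0, high=2.0)
```

Because each component integrates to one over the whole space, the estimate should approach 4 when V covers essentially all of the mass, which gives a quick sanity check on the sketch.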