Consistent Estimation of Identifiable Nonparametric Mixture Models from Grouped Observations

Authors: Alexander Ritchie, Robert A. Vandermeulen, Clayton Scott

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section we compare our coreset approach against several competing methods on a number of real and highly overlapping synthetic datasets.
Researcher Affiliation | Academia | Alexander Ritchie, Department of EECS, University of Michigan, Ann Arbor, MI 48109; Robert A. Vandermeulen, ML group, Technische Universität Berlin, 10587 Berlin, Germany; Clayton Scott, Departments of EECS, Statistics, University of Michigan, Ann Arbor, MI 48109
Pseudocode | Yes | Pseudocode for the APSGD algorithm for solving (6) is given in the supplementary material.
Open Source Code | No | All code and synthetic datasets are publicly available. [Footnote 2: Authors' GitHub link to go here in final version.]
Open Datasets | Yes | All code and synthetic datasets are publicly available [footnote 2]. The MAGIC gamma ray detection dataset [33] is publicly available via the UCI machine learning repository. The Russian-troll-tweets Twitter dataset is publicly available through FiveThirtyEight [footnote 3].
Dataset Splits | Yes | Each method was trained using 80% of the available data, and the ROC curve was generated from the remaining 20%.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running experiments.
Software Dependencies | No | The paper does not provide specific software names with version numbers for reproducibility.
Experiment Setup | Yes | For synthetic experiments, R was selected to yield the initialization with the lowest empirical TISE. R was chosen from {10, 20, 30, 40, 50} for both moons datasets, and from {60, 70, 80, 90, 100} for the Olympic rings and half-disks datasets. We used R = 200 for the MAGIC and Twitter datasets.
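
The Dataset Splits and Experiment Setup entries describe an 80/20 train/test split, an ROC curve computed on the held-out 20%, and a choice of R from a small candidate set by lowest empirical TISE. The following is a minimal sketch of those three steps in plain Python, not the authors' code. The function names (`split_80_20`, `roc_points`, `select_r`), the random shuffling with a fixed seed, and the `criterion` callable are all assumptions introduced for illustration. In particular, `criterion` is a hypothetical stand-in for whatever computes the empirical TISE of an initialization, which the summary does not define.

```python
import random


def split_80_20(items, seed=0):
    """Shuffle a copy of `items` and return (train, test) with an 80/20 split.

    Shuffling and the fixed seed are assumptions; the summary only states
    the 80%/20% proportions.
    """
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def roc_points(scores, labels):
    """Return ROC points as (false positive rate, true positive rate) pairs.

    `labels` are 1 (positive) or 0 (negative); a higher score means
    "more likely positive". Tied scores are grouped into a single point.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC needs at least one positive and one negative")
    # Visit examples from highest score to lowest, lowering the threshold.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for rank, i in enumerate(order):
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        is_last = rank == len(order) - 1
        if is_last or scores[order[rank + 1]] != scores[i]:
            points.append((fp / n_neg, tp / n_pos))
    return points


def select_r(candidates, criterion):
    """Return the candidate R with the lowest criterion value.

    `criterion` is a hypothetical callable mapping R to a number,
    standing in for the empirical TISE mentioned in the summary.
    """
    return min(candidates, key=criterion)
```

A possible usage: call `split_80_20` on the list of examples, fit a method on the first part, score the second part, and pass those scores with the true labels to `roc_points`. For the synthetic datasets, `select_r([10, 20, 30, 40, 50], criterion)` would pick R for the moons datasets once a real criterion function is supplied.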