Consistent Estimation of Identifiable Nonparametric Mixture Models from Grouped Observations
Authors: Alexander Ritchie, Robert A. Vandermeulen, Clayton Scott
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we compare our coreset approach against several competing methods on a number of real and highly overlapping synthetic datasets. |
| Researcher Affiliation | Academia | Alexander Ritchie, Department of EECS, University of Michigan, Ann Arbor, MI 48109; Robert A. Vandermeulen, ML group, Technische Universität Berlin, 10587 Berlin, Germany; Clayton Scott, Departments of EECS, Statistics, University of Michigan, Ann Arbor, MI 48109 |
| Pseudocode | Yes | Pseudocode for the APSGD algorithm for solving (6) is given in the supplementary material. |
| Open Source Code | No | All code and synthetic datasets are publicly available.2 [Footnote 2: Authors' GitHub link to go here in final version.] |
| Open Datasets | Yes | All code and synthetic datasets are publicly available.2 The MAGIC gamma ray detection dataset [33] is publicly available via the UCI machine learning repository. The Russian-troll-tweets Twitter dataset is publicly available through FiveThirtyEight.3 |
| Dataset Splits | Yes | Each method was trained using 80% of the available data, and the ROC curve was generated from the remaining 20%. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running experiments. |
| Software Dependencies | No | The paper does not provide specific software names with version numbers for reproducibility. |
| Experiment Setup | Yes | For synthetic experiments, R was selected to yield the initialization with the lowest empirical TISE. R was chosen from {10, 20, 30, 40, 50} for both moons datasets, and from {60, 70, 80, 90, 100} for the Olympic rings and half-disks datasets. We used R = 200 for the MAGIC and Twitter datasets. |
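
The split and selection protocol quoted in the Dataset Splits and Experiment Setup rows can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the helper names `split_80_20`, `roc_points`, and `select_r` are hypothetical, the shuffling seed is arbitrary, and `init_error` is a stand-in for the empirical TISE of an initialization, which the table does not define.

```python
import random


def split_80_20(items, seed=0):
    """Shuffle a copy of `items` and return (train, test) with an 80/20 split."""
    rng = random.Random(seed)  # arbitrary seed, assumed for repeatability
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = (4 * len(shuffled)) // 5
    return shuffled[:cut], shuffled[cut:]


def roc_points(scores, labels):
    """Return (FPR, TPR) pairs from sweeping a threshold over `scores`.

    `labels` holds 1 for positives and 0 for negatives; both classes must
    be present. Tied scores are grouped so each threshold yields one point.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last item of a tie group.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def select_r(candidates, init_error):
    """Pick the candidate R whose initialization has the lowest error.

    `init_error` is a hypothetical callable mapping R to a score such as
    the empirical TISE of the resulting initialization.
    """
    return min(candidates, key=init_error)
```

For example, `select_r([10, 20, 30, 40, 50], init_error)` would mirror the grid quoted for the moons datasets, with `init_error` supplied by whatever routine evaluates an initialization.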