Kernel Smoothing, Mean Shift, and Their Learning Theory with Directional Data
Authors: Yikun Zhang, Yen-Chi Chen
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To demonstrate the applicability of the algorithm, we evaluate it as a mode clustering method on both simulated and real-world data sets. [...] Simulation studies and applications to real-world data sets are unfolded in Section 6. |
| Researcher Affiliation | Academia | Yikun Zhang (EMAIL), Yen-Chi Chen (EMAIL), Department of Statistics, University of Washington, Seattle, WA 98195, USA |
| Pseudocode | Yes | Algorithm 1 Mean Shift Algorithm with Directional Data |
| Open Source Code | Yes | All the code for our experiments is available at https://github.com/zhangyk8/DirMS. |
| Open Datasets | Yes | Martian crater data are publicly available on the Gazetteer of Planetary Nomenclature database (https://planetarynames.wr.usgs.gov/AdvancedSearch) of the International Astronomical Union (IAU). [...] The earthquake data can be obtained from the Earthquake Catalog (https://earthquake.usgs.gov/earthquakes/search/) of the United States Geological Survey. |
| Dataset Splits | No | The paper describes generating '1000 data points' or using datasets with specified total counts (e.g., '1653 craters', '1666 earthquakes') for mode clustering, but does not provide specific training/test/validation splits. Mode clustering is generally applied to the entire dataset. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'Matplotlib Basemap Toolkit' but does not specify a version number. Other software dependencies or their versions are not provided. |
| Experiment Setup | Yes | Unless stated otherwise, we use the von Mises kernel $L(r) = e^{-r}$ in the directional KDE (2) to estimate the directional densities and their derivatives. [...] the default bandwidth parameter is selected via the rule of thumb in Proposition 2 in García-Portugués (2013) [...]. The estimated concentration parameter $\hat{\nu}$ is given by (4.4) in Banerjee et al. (2005) [...]. In addition, the tolerance level for terminating the algorithm is set to $\epsilon = 10^{-7}$. |
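
To make the experiment setup concrete, the following is a minimal sketch of a mean shift iteration on the unit sphere with the von Mises kernel and a stopping tolerance of 10^{-7}, as quoted in the table. It is an illustration written for this summary, not the authors' code from the repository. The function name `directional_mean_shift`, the fixed bandwidth `h=0.3`, and the `max_iter` cap are assumptions; the paper selects the bandwidth by a rule of thumb that is not reproduced here.

```python
import numpy as np


def directional_mean_shift(data, starts, h=0.3, eps=1e-7, max_iter=5000):
    """Sketch of mean shift on the unit sphere with a von Mises kernel.

    data   : (n, d) array of observations (normalized to unit length here)
    starts : (m, d) array of initial points to be shifted toward modes
    h      : bandwidth (hypothetical fixed value, not the paper's rule of thumb)
    eps    : tolerance on the largest per-point move between iterations
    """
    data = np.asarray(data, dtype=float)
    points = np.array(starts, dtype=float)  # copy so the input is not modified
    data = data / np.linalg.norm(data, axis=1, keepdims=True)
    points = points / np.linalg.norm(points, axis=1, keepdims=True)

    for _ in range(max_iter):
        # Kernel weights L((1 - y^T x_i) / h^2) with L(r) = exp(-r).
        weights = np.exp((points @ data.T - 1.0) / h**2)
        # Weighted sum of observations, projected back onto the sphere.
        shifted = weights @ data
        shifted /= np.linalg.norm(shifted, axis=1, keepdims=True)
        step = np.linalg.norm(shifted - points, axis=1).max()
        points = shifted
        if step < eps:
            break
    return points
```

For mode clustering, one would typically pass the data set itself as `starts` and group the points whose limits coincide up to a small distance. That grouping step is left out of this sketch.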