Kernel Smoothing, Mean Shift, and Their Learning Theory with Directional Data

Authors: Yikun Zhang, Yen-Chi Chen

JMLR 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | To demonstrate the applicability of the algorithm, we evaluate it as a mode clustering method on both simulated and real-world data sets. [...] Simulation studies and applications to real-world data sets are unfolded in Section 6. |
| Researcher Affiliation | Academia | Yikun Zhang EMAIL Yen-Chi Chen EMAIL Department of Statistics University of Washington Seattle, WA 98195, USA |
| Pseudocode | Yes | Algorithm 1 Mean Shift Algorithm with Directional Data |
| Open Source Code | Yes | All the code for our experiments is available at https://github.com/zhangyk8/DirMS. |
| Open Datasets | Yes | Martian crater data are publicly available on the Gazetteer of Planetary Nomenclature database (https://planetarynames.wr.usgs.gov/AdvancedSearch) of the International Astronomical Union (IAU). [...] The earthquake data can be obtained from the Earthquake Catalog (https://earthquake.usgs.gov/earthquakes/search/) of the United States Geological Survey. |
| Dataset Splits | No | The paper describes generating '1000 data points' or using data sets with specified total counts (e.g., '1653 craters', '1666 earthquakes') for mode clustering, but does not provide training/validation/test splits. Mode clustering is generally applied to the entire data set. |
| Hardware Specification | No | The paper does not provide hardware details (e.g., GPU/CPU models, memory amounts) for the machines used to run its experiments. |
| Software Dependencies | No | The paper mentions 'Matplotlib Basemap Toolkit' but does not specify a version number. No other software dependencies or versions are given. |
| Experiment Setup | Yes | Unless stated otherwise, we use the von Mises kernel $L(r) = e^{-r}$ in the directional KDE (2) to estimate the directional densities and their derivatives. [...] the default bandwidth parameter is selected via the rule of thumb in Proposition 2 in García-Portugués (2013) [...]. The estimated concentration parameter $\hat{\nu}$ is given by (4.4) in Banerjee et al. (2005) [...]. In addition, the tolerance level for terminating the algorithm is set to $\epsilon = 10^{-7}$. |
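
The experiment setup quoted above (von Mises kernel, tolerance of 10^{-7}) can be illustrated with a minimal Python sketch of a mean shift iteration on the unit sphere. This is not the authors' code from their repository. The function name `directional_mean_shift`, the parameters `h` and `max_iter`, and the stopping rule based on the distance between successive iterates are assumptions made for illustration. The rule-of-thumb bandwidth selector is not implemented here, so `h` must be supplied by the caller.

```python
import numpy as np


def directional_mean_shift(data, init, h, tol=1e-7, max_iter=1000):
    """Illustrative mean shift on the unit sphere (hypothetical helper).

    data : (n, d) array whose rows are unit vectors.
    init : (d,) starting point; it is normalized before iterating.
    h    : bandwidth, chosen by the caller (assumption: no rule of thumb here).
    tol  : stop when successive iterates differ by less than this in norm
           (assumed stopping rule; 1e-7 mirrors the quoted tolerance).
    """
    x = np.asarray(init, dtype=float)
    x = x / np.linalg.norm(x)
    for _ in range(max_iter):
        # von Mises kernel weights exp(-(1 - x^T x_i) / h^2) for each x_i.
        weights = np.exp(-(1.0 - data @ x) / h**2)
        # Weighted sum of the data points, projected back onto the sphere.
        shifted = weights @ data
        x_new = shifted / np.linalg.norm(shifted)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x
```

Running the function from every data point and grouping points whose iterates converge to the same location would give a mode clustering in the spirit of the paper. The grouping step is left out of this sketch.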