Minimum Density Hyperplanes
Authors: Nicos G. Pavlidis, David P. Hofmeyr, Sotiris K. Tasoulis
JMLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate its performance on a range of benchmark data sets. The proposed approach is found to be very competitive with state-of-the-art methods for clustering and semi-supervised classification. Experimental results are presented in Section 5. |
| Researcher Affiliation | Academia | Nicos G. Pavlidis EMAIL Department of Management Science Lancaster University Lancaster, LA1 4YX, UK; David P. Hofmeyr EMAIL Department of Mathematics and Statistics Lancaster University Lancaster, LA1 4YF, UK; Sotiris K. Tasoulis EMAIL Department of Applied Mathematics Liverpool John Moores University, Liverpool, L3 3AF, UK |
| Pseudocode | No | The paper describes algorithms and mathematical formulations but does not include a clearly labeled pseudocode block or algorithm section with structured steps. |
| Open Source Code | Yes | The underlying code and data are openly available from Lancaster University data repository at http://dx.doi.org/10.17635/lancaster/researchdata/97. |
| Open Datasets | Yes | Details of benchmark data sets: size (n), dimensionality (d), number of clusters (c). UCI machine learning repository: https://archive.ics.uci.edu/ml/datasets.html |
| Dataset Splits | Yes | For each value of ℓ, 30 random partitions into labelled and unlabelled data are considered. As classes are balanced in the data sets considered, performance is measured only in terms of classification error on the unlabelled data. For data sets with more than two classes all pairwise combinations of classes are considered and aggregate performance is reported. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used (e.g., GPU/CPU models, memory details) for running its experiments. |
| Software Dependencies | No | The paper discusses various algorithms and methods (e.g., k-means++, LDA-km, iSVR-L and iSVR-G, Normalised cut spectral clustering (SCn), Laplacian Regularised Support Vector Machines (LapSVM), Simple Semi-Supervised Learning (SSSL), Correlated Nyström Views (XNV)), but does not provide specific version numbers for any software libraries, programming languages, or tools used in the implementation. |
| Experiment Setup | Yes | In all experiments we set the bandwidth parameter to h = 0.9 σ̂_pc1 n^(−1/5), where σ̂_pc1 is the estimated standard deviation of the data projected onto the first principal component. This bandwidth selection rule is recommended when the density being approximated is assumed to be multimodal (Silverman, 1986). The parameter η controls the distance between the minimisers of arg min_{b∈ℝ} f^CL(v, b) and arg min_{b∈F(v)} Î(v, b), while larger values of ε increase the smoothness of the penalised function f^CL. Values of η close to zero affect the numerical stability of the one-dimensional optimisation problem, due to the term L_{η,ε} in f^CL becoming very large. We used η = 10^−2 and ε = 1×10^−6 to avoid numerical instability. The penalty parameter γ is first set to 0.1 and with this setting α is progressively increased in the same way as for clustering. After this, α is kept at α_max and γ is increased to 1 and then 10. |
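The bandwidth rule quoted above can be made concrete with a short sketch. This is an illustration written for this report, not the authors' released code; the function name `mdh_bandwidth` is our own, and the first principal component is obtained here via an SVD of the centred data.

```python
import numpy as np

def mdh_bandwidth(X):
    """Sketch of the quoted rule h = 0.9 * sigma_pc1 * n**(-1/5),
    where sigma_pc1 is the standard deviation of the data projected
    onto the first principal component. Illustrative only."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)          # centre the data
    # Singular values of the centred matrix give the PC standard
    # deviations up to a factor of sqrt(n - 1).
    s = np.linalg.svd(Xc, compute_uv=False)
    sigma_pc1 = s[0] / np.sqrt(n - 1)
    return 0.9 * sigma_pc1 * n ** (-1.0 / 5.0)
```

Under this rule the bandwidth shrinks slowly with sample size (at rate n^(−1/5)), which matches Silverman's multimodal-density recommendation cited in the row above.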