k-HyperEdge Medoids for Clustering Ensemble

Authors: Feijiang Li, Jieting Wang, Liuya Zhang, Yuhua Qian, Shuai Jin, Tao Yan, Liang Du

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The CEHM is experimentally analyzed from three aspects: the demonstration of working mechanisms on artificial data, convergence demonstration on real data, and ensemble performance comparison with representative methods. The experimental analyses are conducted on a PCWIN64 computer with 64G memory. The effectiveness and efficiency of the proposed method are also verified on these data, with nine representative clustering ensemble algorithms as reference. The ensemble performance results are shown in Table 2 and Table 4 (in Appendix A.8). The tables respectively show the average NMI and ARI over the 30 runs of each method on each data set.
Researcher Affiliation | Academia | Feijiang Li (1,2), Jieting Wang (1,2), Liuya Zhang (1), Yuhua Qian (1,2)*, Shuai Jin (1), Tao Yan (1,2), Liang Du (1,2). (1) Institute of Big Data Science and Industry, Shanxi University; (2) Key Laboratory of Evolutionary Science Intelligence of Shanxi Province, Taiyuan, Shanxi, China. EMAIL
Pseudocode | Yes | The processes of the k-HyperEdge initialization step are shown as Algorithm 1 in Appendix A.1. The processes of the k-HyperEdge diffusion step are shown as Algorithm 2 in Appendix A.1. Further, the quality of $E_d$ is improved by the following k-HyperEdge adjustion step. The processes of the k-HyperEdge adjustion step are shown as Algorithm 3 in Appendix A.1. The final hyperedge set is the discovered k-HyperEdge Medoids and noted as $E_m = \{e_1, e_2, \ldots, e_k\}$. The final corresponding clustering ensemble result is $\pi = \{c_1 = e_1, c_2 = e_2, \ldots, c_k = e_k\}$ (Eq. 17). We note the above clustering ensemble method based on the construction of k-HyperEdge Medoids as CEHM. The processes of CEHM are shown as Algorithm 4 in Appendix A.1.
Open Source Code | Yes | Demo code is at https://github.com/FeijiangLi/Code-kHyperEdge-Medoids-for-Clustering-Ensemble-AAAI
Open Datasets | Yes | The convergence of CEHM is analyzed on twenty widely used benchmark data sets. The information about these data sets is shown in Table 1. ... Table 1: Description of the data sets [lists iris, wine, seeds, heart, soybean-train, ecoli, dermatology, low-res-spect, breast-cancer-wisc-diag, energy, lbp-riu-gris, semeion, statlog-landsat-test, cardiotocography-3clases, statlog-landsat-train, twonorm, mushroom, statlog-shuttle, PenDigits, USPS]
Dataset Splits | No | For a data set, the clustering result set is generated by running the k-means algorithm multiple times with different initial centers and different cluster numbers k. For a data set with n samples and k clusters, the numbers of clusters in the base clustering results are randomly selected in the range $[k, \max\{\min\{\sqrt{n}, 50\}, \frac{3}{2}k\}]$.
Hardware Specification | Yes | The experimental analyses are conducted on a PCWIN64 computer with 64G memory.
Software Dependencies | No | The k-medoids algorithm is suitable for arbitrary distance measurement and robust to noise (Tiwari et al. 2020). After running the k-medoids algorithm on E(Πc), k hyperedges are obtained, which are noted as $E_i = \{e_{i1}, e_{i2}, \ldots, e_{ik}\}$. ... For a data set, the clustering result set is generated by running the k-means algorithm multiple times with different initial centers and different cluster numbers k.
Experiment Setup | Yes | For a data set, the clustering result set is generated by running the k-means algorithm multiple times with different initial centers and different cluster numbers k. For a data set with n samples and k clusters, the numbers of clusters in the base clustering results are randomly selected in the range $[k, \max\{\min\{\sqrt{n}, 50\}, \frac{3}{2}k\}]$.
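
The cluster-number sampling described in the Experiment Setup row can be sketched as follows. This is a minimal illustration and not the authors' code. It assumes the range reads [k, max{min{sqrt(n), 50}, 3k/2}], and it assumes the real-valued upper bound is rounded down, since the source does not state a rounding rule. The function names and the `n_base` parameter (how many base clusterings to generate) are hypothetical.

```python
import math
import random
from typing import List, Optional, Tuple


def base_cluster_range(n_samples: int, k: int) -> Tuple[int, int]:
    """Inclusive integer bounds for the cluster count of a base clustering.

    Assumed range: [k, max(min(sqrt(n), 50), 1.5 * k)].
    Assumption: the upper bound is rounded down to an integer.
    """
    upper = max(min(math.sqrt(n_samples), 50.0), 1.5 * k)
    # Guard so the upper bound never drops below k after rounding.
    return k, max(k, math.floor(upper))


def sample_cluster_counts(
    n_samples: int, k: int, n_base: int, seed: Optional[int] = None
) -> List[int]:
    """Draw one cluster count per base clustering, uniformly from the range."""
    rng = random.Random(seed)
    low, high = base_cluster_range(n_samples, k)
    # random.Random.randint includes both endpoints.
    return [rng.randint(low, high) for _ in range(n_base)]


if __name__ == "__main__":
    # iris has 150 samples and 3 classes; n_base=10 is an arbitrary choice.
    print(base_cluster_range(150, 3))
    print(sample_cluster_counts(150, 3, n_base=10, seed=0))
```

Each sampled count would then be passed as the number of clusters to one k-means run with its own random initial centers, which yields one base clustering per draw.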