reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

k-HyperEdge Medoids for Clustering Ensemble

Authors: Feijiang Li, Jieting Wang, Liuya Zhang, Yuhua Qian, Shuai Jin, Tao Yan, Liang Du

AAAI 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The CEHM is experimentally analyzed from three aspects: the demonstration of working mechanisms on artificial data, convergence demonstration on real data, and ensemble performance comparison with representative methods. The experimental analyses are conducted on a PCWIN 64 computer with 64 GB of memory. The effectiveness and efficiency of the proposed method are also verified on these data, with nine representative clustering ensemble algorithms as reference. The ensemble performance results are shown in Table 2 and Table 4 (in Appendix A.8). The tables show, respectively, the average NMI and ARI over 30 runs for each method on each data set.
Researcher Affiliation	Academia	Feijiang Li (1,2), Jieting Wang (1,2), Liuya Zhang (1), Yuhua Qian (1,2)*, Shuai Jin (1), Tao Yan (1,2), Liang Du (1,2). (1) Institute of Big Data Science and Industry, Shanxi University; (2) Key Laboratory of Evolutionary Science Intelligence of Shanxi Province, Taiyuan, Shanxi, China. EMAIL
Pseudocode	Yes	The processes of the k-HyperEdge initialization step are shown as Algorithm 1 in Appendix A.1. The processes of the k-HyperEdge diffusion step are shown as Algorithm 2 in Appendix A.1. Further, the quality of E_d is improved by the following k-HyperEdge adjustion step. The processes of the k-HyperEdge adjustion step are shown as Algorithm 3 in Appendix A.1. The final hyperedge set is the discovered k-HyperEdge Medoids and noted as E_m = {e_1, e_2, ..., e_k}. The final corresponding clustering ensemble result is π = {c_1 = e_1, c_2 = e_2, ..., c_k = e_k} (Eq. 17). We note the above clustering ensemble method based on the construction of k-HyperEdge Medoids as CEHM. The processes of CEHM are shown as Algorithm 4 in Appendix A.1.
Open Source Code	Yes	Demo code is at https://github.com/Feijiang Li/Code-k Hyper Edge-Medoids-for-Clustering-Ensemble-AAAI
Open Datasets	Yes	The convergence of CEHM is analyzed on twenty widely used benchmark data sets. The information about these data sets is shown in Table 1. ... Table 1: Description of the data sets [lists iris, wine, seeds, heart, soybean-train, ecoli, dermatology, low-res-spect, breast-cancer-wisc-diag, energy, lbp-riu-gris, semeion, statlog-landsat-test, cardiotocography-3clases, statlog-landsat-train, twonorm, mushroom, statlog-shuttle, Pen Digits, USPS]
Dataset Splits	No	For a data set, the clustering result set is generated by running the k-means algorithm multiple times with different initial centers and different cluster numbers k. For a data set with n samples and k clusters, the numbers of clusters in the base clustering results are randomly selected in the range [k, max{min{√n, 50}, (3/2)k}].
Hardware Specification	Yes	The experimental analyses are conducted on a PCWIN 64 computer with 64 GB of memory.
Software Dependencies	No	The k-medoids algorithm is suitable for arbitrary distance measurement and robust to noise (Tiwari et al. 2020). After running the k-medoids algorithm on E(Πc), k hyperedges are obtained, which are denoted as Ei = {ei1, ei2, ..., eik}. ... For a data set, the clustering result set is generated by running the k-means algorithm multiple times with different initial centers and different cluster numbers k.
Experiment Setup	Yes	For a data set, the clustering result set is generated by running the k-means algorithm multiple times with different initial centers and different cluster numbers k. For a data set with n samples and k clusters, the numbers of clusters in the base clustering results are randomly selected in the range [k, max{min{√n, 50}, (3/2)k}].
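
The base-clustering generation quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the upper bound of the cluster-number range is max{min{√n, 50}, (3/2)k}, which is a reading of a formula that was damaged in extraction, and it assumes rounding (floor of √n, ceiling of 1.5k) that the source does not state. The function names `cluster_count_range`, `kmeans`, and `generate_base_clusterings`, the iteration count, and the seed handling are all hypothetical.

```python
import math
import random


def cluster_count_range(n, k):
    """Return (low, high) for the number of clusters in a base clustering.

    ASSUMPTION: high = max(min(sqrt(n), 50), 1.5 * k). The square root, the
    3/2 factor, and the rounding are guesses, not confirmed by the source.
    """
    high = max(min(math.isqrt(n), 50), math.ceil(1.5 * k))
    return k, min(max(high, k), n)  # never ask for more clusters than samples


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means on a list of equal-length tuples; returns labels."""
    centers = rng.sample(points, k)  # random initial centers drawn from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def generate_base_clusterings(points, k, n_base, seed=0):
    """Run k-means n_base times, each with a random cluster number and random centers."""
    rng = random.Random(seed)
    low, high = cluster_count_range(len(points), k)
    return [kmeans(points, rng.randint(low, high), rng) for _ in range(n_base)]
```

Each returned label list is one base clustering. A clustering ensemble method such as CEHM would take the whole list as input, and that step is not sketched here.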