TANGO: Clustering with Typicality-Aware Nonlocal Mode-Seeking and Graph-Cut Optimization
Authors: Haowen Ma, Zhiguo Long, Hua Meng
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on several synthetic and extensive real-world datasets demonstrate the effectiveness and superiority of TANGO. |
| Researcher Affiliation | Academia | 1. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, China; 2. School of Mathematics, Southwest Jiaotong University, Chengdu, China. Correspondence to: Zhiguo Long <EMAIL>, Hua Meng <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Quick Shift Algorithm 2 Typicality-Aware Mode-Seeking Algorithm 3 TANGO |
| Open Source Code | Yes | The code is available at https://github.com/SWJTU-ML/TANGO_code. |
| Open Datasets | Yes | Experiments were conducted on 14 UCI datasets and 2 image datasets (Table 1). All datasets were min-max normalized. For the image datasets MNIST and Umist, an autoencoder (AE) was used to reconstruct them into 64-dimensional representations. Fig. 5 presents the results of TANGO and 10 comparison algorithms on 16 real-world datasets. The paper also applies TANGO to image segmentation on the Berkeley Segmentation Dataset Benchmark and reports the corresponding running times. |
| Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits. For clustering tasks, it mentions tuning hyperparameters to achieve a reasonable number of clusters while maximizing ARI and using the ground-truth number of clusters for evaluation, implying the full datasets are used for clustering and evaluation metrics. |
| Hardware Specification | Yes | The experimental environment is: Windows 11, Python 3.11, CPU i7-13700KF and 32GB RAM. |
| Software Dependencies | Yes | The experimental environment is: Windows 11, Python 3.11, CPU i7-13700KF and 32GB RAM. |
| Experiment Setup | Yes | For TANGO, the neighborhood size k was searched from 2 to 100 with a step size of 1. Algorithms requiring the number of clusters used the ground-truth number of clusters. The detailed hyperparameter settings of the other algorithms are given in Appendix C.1. C.1. Hyperparameter Settings of the Algorithms: The neighborhood size k was searched from 2 to 100 with a step size of 1 in TANGO, LDP-SC, DEMOS, QKSPP, DPCDBFN, and KNN-SC. Noise ratio parameters in DCDP-ASC and NDP-Kmeans were searched from 0 to 0.2 with a step size of 0.001. The minimum proportion of points that each cluster must contain in LDP-MST was fixed at 0.018 as described in the article. DCDP-ASC and DEMOS required manual selection of core points on decision graphs, and the best-performing selection was used for each parameter setting. In QKSPP, the density fluctuation parameter was searched from 0 to 1 with a step size of 0.1 for each neighborhood size k. CPF searched k and a second parameter among preset combinations provided in the source code. USPEC searched for the number of representative points with a step size of 100 from 100 to 1000 (or the nearest multiple of 100 below the maximum number of data points), and k was searched over all possible values under each representative point count. |
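The Open Datasets row notes that all datasets were min-max normalized before clustering. A minimal pure-Python sketch of that column-wise scaling (the helper name `min_max_normalize` is ours, not from the paper):

```python
def min_max_normalize(rows):
    """Column-wise min-max scaling of a list of feature rows to [0, 1]."""
    cols = list(zip(*rows))
    mins = [min(c) for c in cols]
    # guard against zero-range (constant) columns to avoid division by zero
    rngs = [(max(c) - m) if max(c) > m else 1.0 for c, m in zip(cols, mins)]
    return [[(v - m) / r for v, m, r in zip(row, mins, rngs)] for row in rows]
```

For example, `min_max_normalize([[0, 10], [5, 20], [10, 30]])` scales each column independently to the unit interval.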
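The Pseudocode row lists Quick Shift (Algorithm 1), the classical mode-seeking procedure that TANGO's typicality-aware variant builds on. As a rough illustration of the underlying idea only (a plain k-NN density estimate and a distance cutoff `tau`, not the paper's algorithm), a self-contained sketch:

```python
import math

def quick_shift(points, k=3, tau=2.0):
    """Minimal Quick Shift sketch: estimate density via the k-th nearest
    neighbour distance, link each point to its nearest strictly denser
    neighbour within radius tau, then label points by the mode they reach."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # density estimate: inverse of distance to the k-th nearest neighbour
    density = []
    for i in range(n):
        d = sorted(dist[i])                 # d[0] == 0 (self)
        density.append(1.0 / (d[min(k, n - 1)] + 1e-12))
    # parent[i]: nearest point with strictly higher density, within tau
    parent = list(range(n))
    for i in range(n):
        best, best_d = i, float("inf")
        for j in range(n):
            if density[j] > density[i] and dist[i][j] < min(best_d, tau):
                best, best_d = j, dist[i][j]
        parent[i] = best
    # follow parent pointers uphill to a mode (a root) to obtain labels
    def root(i):
        while parent[i] != i:
            i = parent[i]
        return i
    return [root(i) for i in range(n)]
```

On two well-separated point groups this assigns one label per group; the links cannot cycle because each points strictly uphill in density.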
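The Experiment Setup row describes searching the neighborhood size k from 2 to 100 while maximizing ARI against ground-truth labels. A hedged sketch of such a loop, with a pure-Python pair-counting ARI; the `cluster_fn` callable is a placeholder for any clustering algorithm (not TANGO itself), and `best_k` is our name for the search helper:

```python
from math import comb

def adjusted_rand_index(a, b):
    """Pair-counting Adjusted Rand Index between two labelings."""
    n = len(a)
    cont = {}                                # contingency counts n_ij
    for x, y in zip(a, b):
        cont[(x, y)] = cont.get((x, y), 0) + 1
    sum_ij = sum(comb(v, 2) for v in cont.values())
    a_pairs = sum(comb(list(a).count(c), 2) for c in set(a))
    b_pairs = sum(comb(list(b).count(c), 2) for c in set(b))
    expected = a_pairs * b_pairs / comb(n, 2)
    max_index = (a_pairs + b_pairs) / 2
    if max_index == expected:                # degenerate case, e.g. both trivial
        return 1.0
    return (sum_ij - expected) / (max_index - expected)

def best_k(data, truth, cluster_fn, ks=range(2, 101)):
    """Grid-search the neighborhood size k, keeping the ARI-maximizing value."""
    return max(ks, key=lambda k: adjusted_rand_index(truth, cluster_fn(data, k)))
```

Identical partitions score 1.0 even under label permutation, which is why ARI (rather than raw accuracy) is the natural objective for this kind of hyperparameter sweep.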