TANGO: Clustering with Typicality-Aware Nonlocal Mode-Seeking and Graph-Cut Optimization
Authors: Haowen Ma, Zhiguo Long, Hua Meng
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on several synthetic and extensive real-world datasets demonstrate the effectiveness and superiority of TANGO. |
| Researcher Affiliation | Academia | 1. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu, China; 2. School of Mathematics, Southwest Jiaotong University, Chengdu, China. Correspondence to: Zhiguo Long <EMAIL>, Hua Meng <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Quick Shift Algorithm 2 Typicality-Aware Mode-Seeking Algorithm 3 TANGO |
| Open Source Code | Yes | The code is available at https://github.com/SWJTU-ML/TANGO_code. |
| Open Datasets | Yes | Experiments were conducted on 14 UCI datasets and 2 image datasets (Table 1). All datasets were min-max normalized. For the image datasets MNIST and Umist, an autoencoder (AE) was used to reconstruct them into 64-dimensional representations. Fig. 5 presents the results of TANGO and 10 comparison algorithms on 16 real-world datasets. The paper also applies TANGO to image segmentation on the Berkeley Segmentation Dataset Benchmark and reports the corresponding running times. |
| Dataset Splits | No | The paper does not explicitly provide training/test/validation dataset splits. For clustering tasks, it mentions tuning hyperparameters to achieve a reasonable number of clusters while maximizing ARI and using the ground-truth number of clusters for evaluation, implying the full datasets are used for clustering and evaluation metrics. |
| Hardware Specification | Yes | The experimental environment is: Windows 11, Python 3.11, CPU i7-13700KF and 32GB RAM. |
| Software Dependencies | Yes | The experimental environment is: Windows 11, Python 3.11, CPU i7-13700KF and 32GB RAM. |
| Experiment Setup | Yes | For TANGO, the neighborhood size k was searched from 2 to 100 with a step size of 1. Algorithms requiring the number of clusters used the ground-truth number of clusters. The detailed hyperparameter settings of the other algorithms are given in Appendix C.1. C.1. Hyperparameter Settings of the Algorithms: The neighborhood size k was searched from 2 to 100 with a step size of 1 in TANGO, LDP-SC, DEMOS, QKSPP, DPCDBFN, and KNN-SC. Noise ratio parameters in DCDP-ASC and NDP-Kmeans were searched from 0 to 0.2 with a step size of 0.001. The minimum proportion of points that each cluster must contain in LDP-MST was fixed at 0.018 as described in the article. DCDP-ASC and DEMOS required manual selection of core points on decision graphs, and the best-performing selection was used for each parameter setting. In QKSPP, the density fluctuation parameter was searched from 0 to 1 with a step size of 0.1 for each neighborhood size k. CPF searched k and a second parameter among preset combinations provided in the source code. USPEC searched for the number of representative points with a step size of 100 from 100 to 1000 (or the nearest multiple of 100 below the maximum number of data points), and k was searched over all possible values under each representative point count. |
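The Open Datasets row notes that all datasets were min-max normalized before clustering. A minimal pure-Python sketch of that column-wise scaling (the helper name `min_max_normalize` is ours, not from the paper):

```python
def min_max_normalize(rows):
    """Column-wise min-max scaling of a list of feature rows to [0, 1]."""
    cols = list(zip(*rows))
    mins = [min(c) for c in cols]
    # guard against zero-range (constant) columns to avoid division by zero
    rngs = [(max(c) - m) if max(c) > m else 1.0 for c, m in zip(cols, mins)]
    return [[(v - m) / r for v, m, r in zip(row, mins, rngs)] for row in rows]
```

For example, `min_max_normalize([[0, 10], [5, 20], [10, 30]])` scales each column independently to the unit interval.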
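The Pseudocode row lists Quick Shift (Algorithm 1), the classical mode-seeking procedure that TANGO's typicality-aware variant builds on. As a rough illustration of the underlying idea only (a plain k-NN density estimate and a distance cutoff `tau`, not the paper's algorithm), a self-contained sketch:

```python
import math

def quick_shift(points, k=3, tau=2.0):
    """Minimal Quick Shift sketch: estimate density via the k-th nearest
    neighbour distance, link each point to its nearest strictly denser
    neighbour within radius tau, then label points by the mode they reach."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # density estimate: inverse of distance to the k-th nearest neighbour
    density = []
    for i in range(n):
        d = sorted(dist[i])                 # d[0] == 0 (self)
        density.append(1.0 / (d[min(k, n - 1)] + 1e-12))
    # parent[i]: nearest point with strictly higher density, within tau
    parent = list(range(n))
    for i in range(n):
        best, best_d = i, float("inf")
        for j in range(n):
            if density[j] > density[i] and dist[i][j] < min(best_d, tau):
                best, best_d = j, dist[i][j]
        parent[i] = best
    # follow parent pointers uphill to a mode (a root) to obtain labels
    def root(i):
        while parent[i] != i:
            i = parent[i]
        return i
    return [root(i) for i in range(n)]
```

On two well-separated point groups this assigns one label per group; the links cannot cycle because each points strictly uphill in density.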
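The Experiment Setup row describes searching the neighborhood size k from 2 to 100 while maximizing ARI against ground-truth labels. A hedged sketch of such a loop, with a pure-Python pair-counting ARI; the `cluster_fn` callable is a placeholder for any clustering algorithm (not TANGO itself), and `best_k` is our name for the search helper:

```python
from math import comb

def adjusted_rand_index(a, b):
    """Pair-counting Adjusted Rand Index between two labelings."""
    n = len(a)
    cont = {}                                # contingency counts n_ij
    for x, y in zip(a, b):
        cont[(x, y)] = cont.get((x, y), 0) + 1
    sum_ij = sum(comb(v, 2) for v in cont.values())
    a_pairs = sum(comb(list(a).count(c), 2) for c in set(a))
    b_pairs = sum(comb(list(b).count(c), 2) for c in set(b))
    expected = a_pairs * b_pairs / comb(n, 2)
    max_index = (a_pairs + b_pairs) / 2
    if max_index == expected:                # degenerate case, e.g. both trivial
        return 1.0
    return (sum_ij - expected) / (max_index - expected)

def best_k(data, truth, cluster_fn, ks=range(2, 101)):
    """Grid-search the neighborhood size k, keeping the ARI-maximizing value."""
    return max(ks, key=lambda k: adjusted_rand_index(truth, cluster_fn(data, k)))
```

Identical partitions score 1.0 even under label permutation, which is why ARI (rather than raw accuracy) is the natural objective for this kind of hyperparameter sweep.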