Partial Label Clustering
Authors: Yutong Xie, Fuchao Yang, Yuheng Jia
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments demonstrate our method realizes superior performance when comparing with state-of-the-art constrained clustering methods, and outperforms PLL and semi-supervised PLL methods when only limited samples are annotated. The code and appendix are publicly available at https://github.com/xyt-ml/PLC. Section headings: 5 Experiments; 5.1 Experimental Setup; 5.2 Experimental Results; 5.3 Further Analysis. |
| Researcher Affiliation | Academia | Yutong Xie (1), Fuchao Yang (2), Yuheng Jia (3, 4). (1) Chien-Shiung Wu College, Southeast University, Nanjing 210096, China; (2) College of Software Engineering, Southeast University, Nanjing 210096, China; (3) School of Computer Science and Engineering, Southeast University, Nanjing 210096, China; (4) Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China. |
| Pseudocode | Yes | Algorithm 1 Pseudo-code of PLC |
| Open Source Code | Yes | The code and appendix are publicly available at https://github.com/xyt-ml/PLC. |
| Open Datasets | Yes | To conduct a comprehensive evaluation of our proposed method, we compare our PLC method with other methods on both controlled UCI datasets and real-world datasets. The characteristics of controlled UCI datasets and real-world datasets can be found in Appendix D. Following the widely used partial label data generation protocol [Cour et al., 2011], we generate the artificial partial label datasets under the controlling parameter r which controls the number of false-positive labels. For each example, we randomly select r other labels as false-positive labels. Table 1: Experimental results on ACC when compared with constrained clustering methods under different proportions of partial label training examples on real-world datasets, where bold and underlined indicate the best and second best results respectively. (Table columns: Compared Method, Lost, MSRCv2, Mirflickr, Bird Song.) |
| Dataset Splits | Yes | For constrained clustering methods, we randomly sample the partial label examples based on the proportion ρ ∈ {0.05, 0.10, 0.15, 0.20, 0.30, 0.40} and the remaining samples are used as test data. For PLL and semi-supervised PLL methods, we randomly sample the partial label examples based on the proportion ρ ∈ {0.01, 0.02, 0.05, 0.10}. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | Parameters for our PLC method are set as α, β ∈ {0.01, 0.1, 1}, γ = 10 and k ∈ {10, 15, 20, 25, 30, 40}. Each compared method is implemented with the default hyper-parameter setup suggested in the respective literature. For each experiment, we implemented 10 times with random partitions and reported the average performance with the standard deviation. |
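
The data generation protocol and the ρ-based sampling quoted in the table can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (which is in the linked repository): the function names `make_partial_labels` and `split_by_proportion`, the rounding of ρ·n to the nearest integer, the fixed seed, and the toy labels are all hypothetical choices made for this example.

```python
import numpy as np


def make_partial_labels(y, num_classes, r, rng):
    """Build an (n, num_classes) 0/1 candidate-label matrix.

    Each row holds the ground-truth label plus r other labels drawn
    uniformly at random, without replacement, as false positives.
    """
    y = np.asarray(y)
    if not 0 <= r < num_classes:
        raise ValueError("r must satisfy 0 <= r < num_classes")
    candidates = np.zeros((y.shape[0], num_classes), dtype=np.int8)
    for i, true_label in enumerate(y):
        # All labels except the true one are eligible false positives.
        others = np.delete(np.arange(num_classes), true_label)
        false_pos = rng.choice(others, size=r, replace=False)
        candidates[i, true_label] = 1
        candidates[i, false_pos] = 1
    return candidates


def split_by_proportion(n, rho, rng):
    """Randomly pick round(rho * n) indices as partial label examples.

    The rounding rule is an assumption; the remaining indices are
    returned as the held-out set.
    """
    perm = rng.permutation(n)
    n_partial = int(round(rho * n))
    return perm[:n_partial], perm[n_partial:]


# Toy data: 200 examples, 5 classes (illustrative values only).
rng = np.random.default_rng(0)
y = rng.integers(0, 5, size=200)
S = make_partial_labels(y, num_classes=5, r=2, rng=rng)
partial_idx, rest_idx = split_by_proportion(len(y), rho=0.10, rng=rng)
```

Repeating the split with different seeds would correspond to the 10 random partitions mentioned in the Experiment Setup row, with the mean and standard deviation taken over the runs.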