PROTOCOL: Partial Optimal Transport-enhanced Contrastive Learning for Imbalanced Multi-view Clustering
Authors: Xuqian Xue, Yiming Lei, Qi Cai, Hongming Shan, Junping Zhang
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate PROTOCOL, we establish a comprehensive benchmark on five widely-used multi-view datasets, as shown in Table 1. To simulate real-world imbalanced scenarios, we create imbalanced versions of these datasets with three imbalance ratios R ∈ {0.1, 0.5, 0.9}. The ratios R are kept consistent across views. We place implementation details of PROTOCOL in Appendix B. We compare PROTOCOL with nine state-of-the-art methods... Experiments are conducted on five datasets under three imbalance ratios R ∈ {0.1, 0.5, 0.9}, as shown in Tables 2 to 4. Based on these results, we have the following observations. ... We conduct ablation studies on three datasets under R ∈ {0.1, 0.5} to validate the effectiveness of each component in PROTOCOL. ... We conduct convergence and parameter sensitivity analysis on the Caltech dataset, as shown in Fig. 4. |
| Researcher Affiliation | Academia | 1Shanghai Key Laboratory of Intelligent Information Processing, College of Computer Science and Artificial Intelligence, Fudan University, China. 2College of Computer Science and Technology, Qingdao University, China. 3Shanghai Key Laboratory of Navigation and Location-based Services, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, China. 4Institute of Science and Technology for Brain Inspired Intelligence and Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Fudan University, China. |
| Pseudocode | Yes | Algorithm 1 Multi-view Imbalanced Learning Framework |
| Open Source Code | Yes | Our code is available at https://github.com/Scarlett125/PROTOCOL. |
| Open Datasets | Yes | To evaluate PROTOCOL, we establish a comprehensive benchmark on five widely-used multi-view datasets, as shown in Table 1. ... Hdigit (Chen et al., 2022) ... Fashion (Xiao et al., 2017) ... NUS-WIDE (Chua et al., 2009) ... Caltech (Fei-Fei et al., 2004) ... Cifar10 (Yan et al., 2023) |
| Dataset Splits | No | The paper mentions creating "imbalanced versions of these datasets with three imbalance ratios R {0.1, 0.5, 0.9}" and evaluating on "balanced test sets", implying a train/test split. However, it does not provide specific percentages, sample counts, or methodology for how the training and test sets are partitioned for the experiments across these datasets. |
| Hardware Specification | No | The paper does not provide specific hardware details such as CPU/GPU models, memory, or other computational resources used for the experiments. |
| Software Dependencies | No | The paper mentions using the "Adam optimizer" but does not specify any programming languages, libraries, or frameworks with their respective version numbers. |
| Experiment Setup | Yes | The optimization is performed using the Adam optimizer with a learning rate of 1e-3. The training process consists of three stages: view-specific feature learning with 200 epochs for reconstruction loss, consensus learning with 50-100 epochs for multi-view consistency, and class-rebalanced enhancement with 50-100 epochs for imbalance learning. We set the batch size to 256. ... we set τf = 0.5 and τl = 1.0. Appendix Fig. 7 shows similar stability for the semantic consistency parameter a and the base learning weight λbase, and we set a = 0.5, λbase = 0.1. |
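The paper reports the imbalance ratios R ∈ {0.1, 0.5, 0.9} but, as noted under Dataset Splits, does not state exactly how the imbalanced versions are constructed. A common convention defines R as the ratio of the smallest to the largest class size, with per-class counts decaying exponentially across classes and the same sample indices kept in every view (matching the paper's statement that R is consistent across views). The sketch below follows that assumed convention; the function names and the exponential-decay profile are illustrative, not taken from the paper.

```python
import random


def imbalance_counts(n_per_class, n_classes, ratio):
    """Per-class sample counts under an assumed exponential-decay profile,
    so that the smallest class has ratio * n_per_class samples
    (i.e. ratio = n_min / n_max)."""
    counts = []
    for c in range(n_classes):
        frac = ratio ** (c / max(n_classes - 1, 1))  # decays from 1.0 to ratio
        counts.append(max(1, round(n_per_class * frac)))
    return counts


def subsample_views(views, labels, counts, seed=0):
    """Subsample every view with the SAME kept indices, so the imbalance
    ratio stays consistent across views as the paper requires."""
    rng = random.Random(seed)
    keep = []
    for c, n_keep in enumerate(counts):
        idx = [i for i, y in enumerate(labels) if y == c]
        rng.shuffle(idx)
        keep.extend(idx[:n_keep])
    keep.sort()
    return [[v[i] for i in keep] for v in views], [labels[i] for i in keep]
```

For example, with 100 samples per class, 10 classes, and R = 0.1, the class sizes decay from 100 down to 10; under R = 0.9 the dataset stays nearly balanced, which matches the intent of the three reported regimes.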