Learning from Sample Stability for Deep Clustering

Authors: Zhixin Li, Yuheng Jia, Hui Liu, Junhui Hou

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments across benchmark datasets showcase that incorporating sample stability into training can improve the performance of deep clustering. The code is available at https://github.com/LZX-001/LFSS.
Researcher Affiliation Academia 1School of Computer Science and Engineering, Southeast University, Nanjing 210096, China 2Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China 3School of Computing and Information Sciences, Saint Francis University, Hong Kong, China 4Department of Computer Science, City University of Hong Kong, Hong Kong, China. Correspondence to: Yuheng Jia <EMAIL>.
Pseudocode Yes Algorithm 1 Proposed LFSS
Open Source Code Yes The code is available at https://github.com/LZX-001/LFSS.
Open Datasets Yes We conduct experiments on multiple commonly used datasets, including CIFAR-10 (Krizhevsky, 2009), CIFAR-20 (Krizhevsky, 2009), STL-10 (Coates et al., 2011), ImageNet-10 (Chang et al., 2017), ImageNet-Dogs (Chang et al., 2017), Tiny-ImageNet (Le & Yang, 2015) and ImageNet-1K (Deng et al., 2009).
Dataset Splits No The paper lists standard benchmark datasets such as CIFAR-10, CIFAR-20, STL-10, ImageNet-10, ImageNet-Dogs, Tiny-ImageNet, and ImageNet-1K, and mentions 'We train models on STL-10 with extra unlabeled data.' However, it does not explicitly describe the specific training/test/validation splits used for each dataset or how they were applied in the experimental setup for the deep learning model training phase.
Hardware Specification Yes All experiments are conducted based on PyTorch and all models are trained on an NVIDIA RTX 4090 GPU.
Software Dependencies No The paper states 'All experiments are conducted based on PyTorch' but does not provide a specific version number for PyTorch or any other software dependencies.
Experiment Setup Yes We adopt ResNet-18 as the backbone unless otherwise specified. We train the models for 1,000 epochs with a batch size of 256, unless noted otherwise. We adopt the stochastic gradient descent (SGD) optimizer and the cosine decay learning rate schedule to effectively train our model. Besides, we adopt data augmentation methods following (Chen et al., 2020). We empirically set the trade-off hyper-parameter λ in Eq. (10) to 0.1 for all experiments unless otherwise specified. We set the unstable ratio δ to 0.1 for all experiments to exclude the most unstable samples in the cluster-level loss for LFSS, as indicated by the results in Observation 1. We set the warmup epoch number η to 200 for CIFAR-10 and ImageNet-10, and to 500 for CIFAR-20, STL-10 and ImageNet-Dogs. The noise intensity σ in Eq. (9) is set to 0.01 for STL-10 and ImageNet-10, and to 0.001 for CIFAR-10, CIFAR-20 and ImageNet-Dogs.
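The reported setup can be summarized as a plain-Python sketch: the quoted hyperparameters collected in one config dict, plus the standard cosine decay learning rate formula the paper says it uses. This is a minimal illustration, not the authors' code; the base learning rate `lr_base` is an assumption (the excerpt does not state it), and the function and key names are hypothetical.

```python
import math

def cosine_decay_lr(epoch, total_epochs=1000, lr_base=0.05):
    """Standard cosine decay schedule: lr_base at epoch 0, decaying to 0
    at total_epochs. lr_base is an assumed value, not from the paper."""
    return lr_base * 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))

# Hyperparameters as quoted in the setup description above.
config = {
    "backbone": "ResNet-18",
    "batch_size": 256,
    "epochs": 1000,
    "optimizer": "SGD",
    "lambda_tradeoff": 0.1,   # trade-off hyper-parameter in Eq. (10)
    "unstable_ratio": 0.1,    # delta: excludes most unstable samples from cluster-level loss
    "warmup_epochs": {        # eta, per dataset
        "CIFAR-10": 200, "ImageNet-10": 200,
        "CIFAR-20": 500, "STL-10": 500, "ImageNet-Dogs": 500,
    },
    "noise_sigma": {          # sigma in Eq. (9), per dataset
        "STL-10": 0.01, "ImageNet-10": 0.01,
        "CIFAR-10": 0.001, "CIFAR-20": 0.001, "ImageNet-Dogs": 0.001,
    },
}
```

In a PyTorch training script, the same schedule would typically be realized with `torch.optim.SGD` plus `torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=1000)`.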