Enhancing Contrastive Clustering with Negative Pair-guided Regularization

Authors: Abhishek Kumar, Anish Chakrabarty, Sankha Subhra Mullick, Swagatam Das

TMLR 2024

Reproducibility Variable Result LLM Response
Research Type | Experimental | NRCC's superiority is demonstrated across various datasets with different scales and cluster structures, outperforming 20 state-of-the-art methods. ... Empirical evaluation of the efficacy of NRCC with GridShift (Kumar et al., 2022) against the state-of-the-art in Section 4 shows a 4.7% improvement in clustering accuracy by BYOL+NRCC+UMAP+GridShift on average over eight datasets.
Researcher Affiliation | Collaboration | Abhishek Kumar, ENET Centre, Centre for Energy and Environmental Technologies, VSB – Technical University of Ostrava, Ostrava, Czech Republic; Anish Chakrabarty, Statistics and Mathematics Unit, Indian Statistical Institute, Kolkata, India; Sankha Subhra Mullick, Dolby Laboratories, India; Swagatam Das, Electronics and Communication Sciences Unit, Indian Statistical Institute, Kolkata, India.
Pseudocode | Yes | Algorithm 1: Augmented view generation with SGHMC. Algorithm 2: The proposed InfoNCE+NRCC. Algorithm 3: The proposed BYOL+NRCC.
Open Source Code | Yes | The code base is available at https://github.com/abhisheka456/NRCC.
Open Datasets | Yes | We consider four types of datasets, namely large-scale moderate-resolution CIFAR-10, CIFAR-100 (Krizhevsky & Hinton, 2009), and STL-10 (Coates et al., 2011); moderate-scale higher-resolution subsets of ImageNet such as ImageNet-10 and ImageNet-Dogs (Russakovsky et al., 2015); large-scale higher-resolution Tiny ImageNet (Le & Yang, 2015); and large-scale moderate-resolution long-tailed CIFAR-10-LT (Tang et al., 2020) and CIFAR-20-LT (Tang et al., 2020).
Dataset Splits | No | The paper does not explicitly provide specific training/test/validation dataset splits, percentages, or sample counts used for the experiments. It mentions using established datasets and refers to external protocols for ImageNet-1k but does not detail its own splitting methodology or specific splits for the reported results.
Hardware Specification | Yes | We use the same computing setup with four V100 GPUs while calculating the time in hours to ensure fairness.
Software Dependencies | No | The paper mentions using the Stochastic Gradient Descent (SGD) optimizer but does not specify any software libraries or frameworks with their version numbers (e.g., PyTorch, TensorFlow, Python version).
Experiment Setup | Yes | We have rigorously trained all models for 1,000 epochs, following the conventional recommendations (Tao et al., 2021; Tsai et al., 2021). We have utilized the Stochastic Gradient Descent (SGD) optimizer with a cosine learning rate scheduler that includes a warm-up for the initial 50 updates. For MoCo (He et al., 2020), BYOL (Grill et al., 2020), and NRCC, we set the base learning rate to 0.05, dynamically scaling it with the batch size (β = 0.05 n/256). ... Finally, we set the temperature τ (searched between {0.1, 1, 10}) and the regularization weight λ (varied between {0.1, 0.5, 1}) to 0.1 each. In the case of SGHMC, we set δ1, δ2, δ3, and ζ as 0.1, 0.05, 0.99, and 1 ... Following conventional guidelines, the mini-batch size was 512 for MoCo and 256 for the remaining models, including NRCC. ... train a ResNet-50 for 200 epochs.
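The quoted setup pins down two concrete pieces that a reproduction would need: the batch-size-scaled cosine learning-rate schedule with a 50-step linear warm-up (β = 0.05 n/256), and an InfoNCE-style contrastive loss with temperature τ = 0.1. Below is a minimal sketch of both under standard definitions; the function names are illustrative, and the NRCC negative-pair regularizer itself is not reproduced here since the report does not quote its formula.

```python
import math
import numpy as np

def lr_at(step, total_steps, batch_size, warmup_steps=50, base_lr=0.05):
    """Cosine schedule with linear warm-up; base rate scaled as
    beta = 0.05 * n / 256, matching the quoted setup (illustrative helper)."""
    beta = base_lr * batch_size / 256
    if step < warmup_steps:
        return beta * (step + 1) / warmup_steps          # linear warm-up
    t = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * beta * (1.0 + math.cos(math.pi * t))    # cosine decay to 0

def info_nce(z1, z2, tau=0.1):
    """Standard InfoNCE loss between two batches of view embeddings;
    positives sit on the diagonal of the (n, n) similarity matrix."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)  # L2-normalize views
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = (z1 @ z2.T) / tau                           # scaled similarities
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))            # cross-entropy on diag
```

With batch size 256 the base rate is exactly 0.05, while MoCo's batch size of 512 would scale it to 0.1, consistent with the linear-scaling rule in the quote.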