Correlation Clustering with Active Learning of Pairwise Similarities
Authors: Linus Aronsson, Morteza Haghir Chehreghani
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our framework and the proposed query strategies via several experimental studies. ... In this section, we describe our experimental studies, where additional results are presented in Appendix C. ... Figures 1 and 2 illustrate the results for different real-world datasets with a random initialization of σ0 and noise levels γ = 0.2 and γ = 0.4, respectively. We observe that with even a fairly small amount of noise, all the baselines (nCOBRAS, COBRAS and QECC) perform very poorly. |
| Researcher Affiliation | Academia | Linus Aronsson, Chalmers University of Technology; Morteza Haghir Chehreghani, Chalmers University of Technology |
| Pseudocode | Yes | Algorithm 1 Active clustering procedure ... Algorithm 2 Max Correlation Clustering Algorithm A (dynamic k) |
| Open Source Code | No | Implementations of COBRAS and nCOBRAS are publicly available and are thus used in our experiments.5 Finally, we note that there exist other active semi-supervised clustering methods developed in the constraint clustering setting such as NPU (Xiong et al., 2014) used as a baseline in (Soenen et al., 2021). ... Link to the open-source implementations of COBRAS and nCOBRAS: https://github.com/jonassoenen/noise_robust_cobras |
| Open Datasets | Yes | 2. 20newsgroups: consists of 18846 newsgroups posts (in the form of text) on 20 topics (clusters). ... 3. CIFAR10: consists of 60000 32×32 color images in 10 classes, with 6000 images per class. ... 4. MNIST: consists of 60000 28×28 grayscale images of handwritten digits. |
| Dataset Splits | No | For a dataset of N objects, the number of pairwise similarities is |E| = N(N−1)/2, which implies the huge querying space that active learning needs to deal with. We use a batch size of B = |E|/1000 for all datasets, unless otherwise specified. ... For random initialization, we randomly assign each of the objects to one of the ten different clusters resulting in a clustering C. Then, for each (u, v) ∈ E, the initial similarity σ0(u, v) is set to +0.1 if u and v are in the same cluster according to C, and −0.1 otherwise. |
| Hardware Specification | No | The computations and data handling were enabled by resources provided by the National Academic Infrastructure for Supercomputing in Sweden (NAISS) and the Swedish National Infrastructure for Computing (SNIC) at Chalmers Centre for Computational Science and Engineering (C3SE), High Performance Computing Center North (HPC2N) and Uppsala Multidisciplinary Center for Advanced Computational Science (UPPMAX) partially funded by the Swedish Research Council through grant agreements no. 2022-06725 and no. 2018-05973. |
| Software Dependencies | No | We use the distilbert-base-uncased transformer model loaded from the Flair Python library (Akbik et al., 2018) in order to embed each of the 1000 documents (data points) into a 768-dimensional latent space, in which k-means is performed. ... We use a ResNet18 model (He et al., 2015) trained on the full CIFAR10 dataset in order to embed the 1000 images into a 512-dimensional space, in which k-means is performed. ... We use a simple CNN model trained on the MNIST dataset in order to embed the 1000 images into a 128-dimensional space, in which k-means is performed. |
| Experiment Setup | Yes | Initial pairwise similarities. For each experiment, we are given a dataset with ground-truth labels, where the ground-truth labels are only used for evaluations. Then, for each (u, v) ∈ E in a dataset, we set σ*(u, v) to +1 if u and v belong to the same class, and −1 otherwise. ... For random initialization, we randomly assign each of the objects to one of the ten different clusters resulting in a clustering C. Then, for each (u, v) ∈ E, the initial similarity σ0(u, v) is set to +0.1 if u and v are in the same cluster according to C, and −0.1 otherwise. ... Query strategies. We consider five different query strategies: uniform, uncertainty (Eq. 8), frequency (Eq. 9), maxmin (Eq. 10) and maxexp (Eq. 13). We set ϵ = 0.3, τ = 5 and β = 1 for all experiments unless otherwise specified. ... We use a batch size of B = |E|/1000 for all datasets, unless otherwise specified. ... In our experiments, we set T = 3, η = 2^−52 (double precision machine epsilon) and k = |z| in the first iteration of Algorithm 1, and then k = |Ci| for all remaining iterations where |Ci| denotes the number of clusters in the current clustering Ci (in Algorithm 1). |
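The setup quoted above fixes two reproducibility-relevant quantities: the querying space |E| = N(N−1)/2 over all object pairs (and the derived batch size B = |E|/1000), and the random initialization that sets σ0(u, v) = +0.1 for same-cluster pairs and −0.1 otherwise. A minimal sketch of both, assuming a plain dictionary representation of σ0 and a hypothetical helper name `random_initial_similarities` (not from the paper's code):

```python
import itertools
import random

def pair_count(n):
    # |E| = N(N-1)/2 pairwise similarities for N objects.
    return n * (n - 1) // 2

def random_initial_similarities(objects, num_clusters=10, seed=0):
    # Random initialization as described in the paper's setup:
    # assign each object uniformly to one of `num_clusters` clusters,
    # then set sigma0(u, v) = +0.1 if u and v share a cluster, -0.1 otherwise.
    rng = random.Random(seed)
    assignment = {o: rng.randrange(num_clusters) for o in objects}
    return {
        (u, v): 0.1 if assignment[u] == assignment[v] else -0.1
        for u, v in itertools.combinations(objects, 2)
    }

objects = list(range(1000))           # e.g. the 1000 embedded points per dataset
E = pair_count(len(objects))          # 499500 pairs for N = 1000
B = E // 1000                         # batch size B = |E|/1000 -> 499
sigma0 = random_initial_similarities(objects)
assert len(sigma0) == E
```

This illustrates why the paper emphasizes the size of the querying space: even at N = 1000 there are roughly half a million pairs, so active selection of which similarities to query matters.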