Sketch and shift: a robust decoder for compressive clustering

Authors: Ayoub Belhadji, Rémi Gribonval

TMLR 2024

Reproducibility assessment (Variable — Result — LLM Response):
Research Type: Experimental — "In this work, we undertake a scrutinized examination of CL-OMPR to circumvent its limitations. In particular, we show how this algorithm can fail to recover the clusters even in advantageous scenarios. ... To address these limitations, we propose an alternative decoder offering substantial improvements over CL-OMPR. ... The proposed algorithm can extract clustering information from a sketch of the MNIST (resp. of the CIFAR10) dataset that is 10 times smaller than previously and much easier to tune." (Section 5: Numerical simulations)
Researcher Affiliation: Academia — Ayoub Belhadji (EMAIL), Univ Lyon, ENS de Lyon, Inria, CNRS, UCBL, LIP UMR 5668, Lyon, France; Rémi Gribonval (EMAIL), Univ Lyon, ENS de Lyon, Inria, CNRS, UCBL, LIP UMR 5668, Lyon, France
Pseudocode: Yes — Algorithm 1: CL-OMPR; Algorithm 2: Proposed decoder
Open Source Code: No — The paper states: "The dataset and the code to generate a similar one can be downloaded from https://openreview.net/forum?id=6rWuWbVmgz." This refers to code for generating a synthetic dataset, not to source code for the proposed decoder or for CL-OMPR. The paper does not state that an implementation of its primary algorithms is released.
Open Datasets: Yes — "The proposed algorithm can extract clustering information from a sketch of the MNIST (resp. of the CIFAR10) dataset..." "In this section, we investigate whether the observations of Section 5.2 hold for real datasets. For this purpose, we perform experiments on spectral features of the MNIST dataset... The resulting matrix can be downloaded from https://gitlab.com/dzla/Spectral MNIST." Appendix C.2, The case of CIFAR-10: "we perform experiments on the training set of the CIFAR-10 dataset"
Dataset Splits: No — The paper describes using the full MNIST dataset (N = 70000) and the training set of the CIFAR-10 dataset (N = 60000) for clustering experiments. It does not specify any training/validation/test splits, which are typically used for supervised learning; clustering is an unsupervised task.
Hardware Specification: No — The paper does not give specific hardware details (GPU/CPU models, processors, or memory) for its experiments. It mentions that previous work was done "on a single laptop" and thanks "the Blaise Pascal Center (CBP) for the computational means", but neither is a concrete hardware specification for the reported experiments.
Software Dependencies: No — The paper mentions using the Python package Pycle without specifying a version number. It also references other tools such as ResNet18 (model architecture), SGD (optimizer), Vlfeat, and SIDUS in the bibliography, but does not list versioned software dependencies for its own implementation.
Experiment Setup: Yes — "We consider a dataset X = {x_1, ..., x_N} ⊂ ℝ², where N = 100000 and the x_i are i.i.d. draws from a mixture of isotropic Gaussians ∑_{i=1}^{k} α_i N(c_i, Σ_i), where k = 3, α_1 = α_2 = α_3 = 1/3, c_1, c_2, c_3 ∈ ℝ², and Σ_1 = ⋯ = Σ_k = σ_X² I_2 ∈ ℝ^{2×2}. ... for sketch sizes m ∈ {30, 1000}, averaged over 50 realizations of the sketching operator, as a function of the bandwidth σ. ... For all algorithms we use as a domain Θ the hypercube Θ = [−1, 1]⁶. ... For the three compressive algorithms, we take T = 2k. Moreover, for the two variants of Algorithm 2, we take L = 10000 random initializations..." Appendix C.2: "The network is trained on the training set of CIFAR-10 for 50 epochs with SGD with momentum 0.9, learning rate 0.1, learning rate decay 0.1, batch size 512 and weight decay 5e-4."
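To make the reported setup concrete, the following is a minimal sketch of the synthetic experiment described above: a mixture of k = 3 isotropic Gaussians in ℝ² with equal weights, followed by a random-Fourier-feature sketch of size m, the standard encoder in compressive clustering. The center placement, σ_X, the bandwidth σ, and the function names are illustrative assumptions, not the paper's exact protocol or code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic dataset per the reported setup: N i.i.d. draws from a mixture
# of k = 3 isotropic Gaussians in R^2 with weights alpha_i = 1/3.
N, k, d = 100_000, 3, 2
sigma_X = 0.1                                   # assumed cluster spread
centers = rng.uniform(-1.0, 1.0, size=(k, d))   # assumed c_1, c_2, c_3 in Theta
labels = rng.integers(k, size=N)
X = centers[labels] + sigma_X * rng.standard_normal((N, d))

def sketch(X, m, sigma, rng):
    """Random-Fourier-feature sketch z = (1/N) sum_i exp(i Omega^T x_i),
    with m frequencies drawn for a Gaussian kernel of bandwidth sigma."""
    d = X.shape[1]
    Omega = rng.standard_normal((d, m)) / sigma  # frequency matrix
    return np.exp(1j * X @ Omega).mean(axis=0), Omega

# m = 30 is the smaller of the two sketch sizes used in the paper's sweep.
z, Omega = sketch(X, m=30, sigma=0.3, rng=rng)
print(z.shape)  # (30,)
```

A decoder (CL-OMPR or the proposed alternative) would then recover the k centers from z alone; each entry of z is an average of unit-modulus complex numbers, so |z_j| ≤ 1 always holds.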