The Sparse Matrix-Based Random Projection: A Study of Binary and Ternary Quantization

Authors: Weizhi Lu, Zhongzheng Li, Mingrui Chen, Weiyu Li

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This is validated through classification and clustering experiments, where extremely sparse binary matrices, with only one nonzero entry per column, achieve superior or comparable performance to other denser binary matrices and Gaussian matrices.
Researcher Affiliation | Academia | Weizhi Lu (School of Control Science and Engineering, Shandong University; Key Laboratory of Machine Intelligence and System Control, Ministry of Education); Zhongzheng Li (School of Control Science and Engineering, Shandong University); Mingrui Chen (School of Control Science and Engineering, Shandong University); Weiyu Li (Zhongtai Securities Institute for Financial Studies, Shandong University; National Center for Applied Mathematics in Shandong)
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. It provides mathematical formulations and derivations, but no structured algorithm steps.
Open Source Code | No | The paper makes no explicit statement about releasing source code and provides no link to a code repository.
Open Datasets | Yes | The sparse data intended for projection are generated from the datasets Yale B (Georghiades et al., 2001; Lee et al., 2005), CIFAR10 (Krizhevsky & Hinton, 2009) and Mini-ImageNet (Vinyals et al., 2016), respectively via the feature transforms DWT (Mallat, 2009), AlexNet Conv5 (Krizhevsky et al., 2012) and VGG16 Conv5_3 (Simonyan & Zisserman, 2014).
Dataset Splits | Yes | From the dataset, we randomly select 9/10 samples for training and the rest for testing. CIFAR10 consists of 10 classes of color images, with 6000 samples per class. Mini-ImageNet is a subset of ImageNet (Deng et al., 2009), which consists of 100 classes of color images, each class having 600 samples. For the latter two datasets, we use their default training and testing samples, with the ratio of 5/1.
Hardware Specification | No | The paper does not specify any particular hardware used for running the experiments (e.g., GPU/CPU models, memory details).
Software Dependencies | No | The paper mentions using KNN, SVM with a linear kernel, and k-means algorithms, but does not provide specific version numbers for any software libraries or frameworks used for their implementation.
Experiment Setup | No | The paper specifies feature sparsity ratios (k/n = 1%, 5%, 10%, 20%) and projection ratios (m/n = 10%, 50%), and notes using KNN and SVM for classification and k-means for clustering based on cosine distance. However, it omits key hyperparameters, such as the K value for KNN, the C parameter for SVM, and the number of clusters k for k-means, which are essential for reproducing the experimental setup.
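The core construction the Research Type row describes is an extremely sparse binary projection matrix with only one nonzero entry per column, whose projections are then quantized. A minimal NumPy sketch of that idea, not the paper's implementation: the +1 nonzero value, the matrix density, the zero quantization threshold, and the toy sparsity/projection ratios below are all assumptions chosen to mirror the ratios quoted in the table.

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_binary_projection(n, m, rng):
    """Build an m x n binary matrix with exactly one nonzero (+1, assumed) per column."""
    R = np.zeros((m, n))
    rows = rng.integers(0, m, size=n)   # one random row index per column
    R[rows, np.arange(n)] = 1.0
    return R

def binary_quantize(y):
    """Quantize projections to {-1, +1} around zero (assumed threshold)."""
    return np.where(y >= 0, 1.0, -1.0)

def ternary_quantize(y):
    """Quantize projections to {-1, 0, +1}; sign is the simplest such rule."""
    return np.sign(y)

# Toy sparse feature: n-dim with sparsity k/n = 10%, projection ratio m/n = 10%,
# mirroring two of the ratios reported in the table.
n, m, k = 1000, 100, 100
x = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x[support] = rng.standard_normal(k)

R = sparse_binary_projection(n, m, rng)
y = R @ x                               # random projection of the sparse feature
print(binary_quantize(y)[:10])
```

Because each column holds a single nonzero, computing `R @ x` amounts to scattering each feature coordinate into one output bin, which is what makes this family of matrices so cheap compared to dense Gaussian projections.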
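The Experiment Setup row notes that the classifier and clustering hyperparameters are unspecified, so any reproduction has to pick its own. A hypothetical scikit-learn sketch of the evaluation stage on stand-in features: K = 1 for KNN, C = 1.0 for the linear SVM, and k equal to the number of classes for k-means are all assumed values, not taken from the paper.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))    # stand-in for projected (quantized) features
y = rng.integers(0, 10, size=200)     # 10 classes, as in CIFAR10

# Cosine distance for KNN matches the table's note; n_neighbors=1 is assumed.
knn = KNeighborsClassifier(n_neighbors=1, metric="cosine").fit(X, y)
svm = LinearSVC(C=1.0).fit(X, y)      # C=1.0 assumed (paper does not state it)
km = KMeans(n_clusters=10, n_init=10, random_state=0).fit(X)  # k = #classes, assumed

print(knn.score(X, y), svm.score(X, y))
```

Reporting results alongside the assumed K, C, and k values, as this sketch forces one to do, is exactly the detail whose absence the row flags.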