On Binary Embedding using Circulant Matrices

Authors: Felix X. Yu, Aditya Bhaskara, Sanjiv Kumar, Yunchao Gong, Shih-Fu Chang

JMLR 2017

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Finally, in Section 7, we study the empirical performance of circulant embeddings via extensive experimentation. To compare the performance of the circulant binary embedding techniques, we conduct experiments on three real-world high-dimensional data sets used by the current state-of-the-art method for generating long binary codes (Gong et al., 2013). |
| Researcher Affiliation | Collaboration | (1) Google Research, New York, NY 10011; (2) University of Utah, Salt Lake City, UT 84112; (3) Snap, Inc., Venice, CA 90291; (4) Columbia University, New York, NY 10027 |
| Pseudocode | No | The paper describes its methods and optimization procedures mathematically and in paragraph form but does not contain explicitly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | No | The paper contains no explicit statement about releasing source code for the described methodology, nor a link to a code repository. |
| Open Datasets | Yes | The ImageNet-51200 contains 100k images sampled from 100 random classes from ImageNet (Deng et al., 2009), each represented by a 51,200-dimensional VLAD vector generated by using 400 cluster centers. |
| Dataset Splits | Yes | Following (Gong et al., 2013; Norouzi and Fleet, 2012; Gordo and Perronnin, 2011), we use 10,000 randomly sampled instances for training. We then randomly sample 500 instances, disjoint from the training set, as queries. |
| Hardware Specification | Yes | The time is based on a single 2.9GHz CPU core. In this paper, for fair comparison, we use the same CPU-based implementation for all the methods. |
| Software Dependencies | No | The paper mentions using Fast Fourier Transform algorithms and FFT libraries, as well as a linear SVM, but does not provide specific software names with version numbers. |
| Experiment Setup | Yes | The proposed CBE method is found robust to the choice of λ in (31). For example, in the retrieval experiments, the performance difference for λ = 0.1, 1, 10 is within 0.5%. Therefore, in all the experiments, we simply fix λ = 1. |
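As context for the rows above, a minimal sketch of the randomized circulant binary embedding the paper studies, assuming the standard formulation h(x) = sign(circ(r) · D x), where D is a diagonal matrix of random signs and the circulant matrix–vector product is computed as a circular convolution via the FFT in O(d log d). Function and variable names here are illustrative, not from the paper:

```python
import numpy as np

def circulant_binary_embedding(x, r, signs):
    """Binary code sign(circ(r) @ (signs * x)); the circulant
    product is a circular convolution, computed with the FFT."""
    xd = signs * x                          # random sign flips (diagonal D)
    spectrum = np.fft.fft(r) * np.fft.fft(xd)
    proj = np.fft.ifft(spectrum).real       # circ(r) @ xd in O(d log d)
    return np.sign(proj)

rng = np.random.default_rng(0)
d = 8                                       # toy dimension for illustration
x = rng.normal(size=d)                      # input vector
r = rng.normal(size=d)                      # defining column of the circulant matrix
signs = rng.choice([-1.0, 1.0], size=d)     # Bernoulli sign diagonal
code = circulant_binary_embedding(x, r, signs)
```

The FFT route gives the same result as a naive O(d²) multiply by the explicit circulant matrix C[i, j] = r[(i − j) mod d], which is what makes 51,200-dimensional inputs like the paper's VLAD vectors tractable.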
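The dataset-splits row describes disjoint random sampling of training and query instances. A hedged sketch of that protocol, where only the sample counts come from the quoted text and everything else (seed, dataset size, index handling) is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000                       # e.g. ImageNet-51200 has 100k images
perm = rng.permutation(n)         # shuffle once so the two samples cannot overlap
train_idx = perm[:10_000]         # 10,000 randomly sampled training instances
query_idx = perm[10_000:10_500]   # 500 queries, disjoint from the training set
```

Drawing both samples from a single permutation guarantees the disjointness the paper requires without an explicit rejection step.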