On Binary Embedding using Circulant Matrices

Authors: Felix X. Yu, Aditya Bhaskara, Sanjiv Kumar, Yunchao Gong, Shih-Fu Chang

JMLR 2017

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Finally, in Section 7, we study the empirical performance of circulant embeddings via extensive experimentation. To compare the performance of the circulant binary embedding techniques, we conduct experiments on three real-world high-dimensional data sets used by the current state-of-the-art method for generating long binary codes (Gong et al., 2013). |
| Researcher Affiliation | Collaboration | (1) Google Research, New York, NY 10011; (2) University of Utah, Salt Lake City, UT 84112; (3) Snap, Inc., Venice, CA 90291; (4) Columbia University, New York, NY 10027 |
| Pseudocode | No | The paper describes its methods and optimization procedures mathematically and in paragraph form but does not contain explicitly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | No | The paper contains no explicit statement about releasing source code for the described methodology, nor a link to a code repository. |
| Open Datasets | Yes | The ImageNet-51200 contains 100k images sampled from 100 random classes from ImageNet (Deng et al., 2009), each represented by a 51,200-dimensional VLAD vector generated by using 400 cluster centers. |
| Dataset Splits | Yes | Following (Gong et al., 2013; Norouzi and Fleet, 2012; Gordo and Perronnin, 2011), we use 10,000 randomly sampled instances for training. We then randomly sample 500 instances, disjoint from the training set, as queries. |
| Hardware Specification | Yes | The time is based on a single 2.9GHz CPU core. In this paper, for fair comparison, we use the same CPU-based implementation for all the methods. |
| Software Dependencies | No | The paper mentions using Fast Fourier Transform algorithms and FFT libraries, as well as a linear SVM, but does not provide specific software names with version numbers. |
| Experiment Setup | Yes | The proposed CBE method is found robust to the choice of λ in (31). For example, in the retrieval experiments, the performance difference for λ = 0.1, 1, 10 is within 0.5%. Therefore, in all the experiments, we simply fix λ = 1. |
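As context for the rows above, a minimal sketch of the randomized circulant binary embedding the paper studies, assuming the standard formulation h(x) = sign(circ(r) · D x), where D is a diagonal matrix of random signs and the circulant matrix–vector product is computed as a circular convolution via the FFT in O(d log d). Function and variable names here are illustrative, not from the paper:

```python
import numpy as np

def circulant_binary_embedding(x, r, signs):
    """Binary code sign(circ(r) @ (signs * x)); the circulant
    product is a circular convolution, computed with the FFT."""
    xd = signs * x                          # random sign flips (diagonal D)
    spectrum = np.fft.fft(r) * np.fft.fft(xd)
    proj = np.fft.ifft(spectrum).real       # circ(r) @ xd in O(d log d)
    return np.sign(proj)

rng = np.random.default_rng(0)
d = 8                                       # toy dimension for illustration
x = rng.normal(size=d)                      # input vector
r = rng.normal(size=d)                      # defining column of the circulant matrix
signs = rng.choice([-1.0, 1.0], size=d)     # Bernoulli sign diagonal
code = circulant_binary_embedding(x, r, signs)
```

The FFT route gives the same result as a naive O(d²) multiply by the explicit circulant matrix C[i, j] = r[(i − j) mod d], which is what makes 51,200-dimensional inputs like the paper's VLAD vectors tractable.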
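The dataset-splits row describes disjoint random sampling of training and query instances. A hedged sketch of that protocol, where only the sample counts come from the quoted text and everything else (seed, dataset size, index handling) is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000                       # e.g. ImageNet-51200 has 100k images
perm = rng.permutation(n)         # shuffle once so the two samples cannot overlap
train_idx = perm[:10_000]         # 10,000 randomly sampled training instances
query_idx = perm[10_000:10_500]   # 500 queries, disjoint from the training set
```

Drawing both samples from a single permutation guarantees the disjointness the paper requires without an explicit rejection step.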