Pacmann: Efficient Private Approximate Nearest Neighbor Search

Authors: Mingxun Zhou, Elaine Shi, Giulia Fanti

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically evaluate the performance of PACMANN in terms of search quality, query latency, communication, and storage costs. Our results show that PACMANN achieves significantly better search quality than the clustering-based private ANN search algorithm used by the state-of-the-art Tiptoe (Henzinger et al., 2023) and Wally (Asi et al., 2024). For example, our implementation finds the most relevant result in around 63% of the queries on the MS-MARCO dataset, compared to 29% for clustering-based algorithms, a 2.1x improvement in search success rate.
Researcher Affiliation | Academia | Mingxun Zhou, Elaine Shi & Giulia Fanti, Carnegie Mellon University, Pittsburgh, PA 15213, USA. EMAIL
Pseudocode | Yes | Figure 4: Description of the graph-based ANN search algorithm, including the optional optimizations. The pseudocode can be read in the following two ways: 1) the non-highlighted part describes the generic graph-based ANN search algorithm; 2) the full description, including the optimizations we introduced in Section 4.3, describes our customized version of the algorithm. Figure 7: Description of our graph building algorithm ANN.Build Graph.
Open Source Code | Yes | Our open-source implementation can be found at the anonymous repository on GitHub: https://github.com/wuwuz/pacmann. The repository contains the following components: instructions to install the required dependencies; our own implementation of PACMANN; all baseline implementations, including NGT, clustering-based ANN, and the plaintext inner-product search (referred to as the Linear algorithm in Section 5); a script to download the SIFT dataset; customizable scripts to run the baselines and our implementation with different parameter settings.
Open Datasets | Yes | SIFT Dataset (Jégou et al., 2011). We use the first 100 million 128-dimensional vectors from the SIFT dataset for our evaluation. MS-MARCO Dataset (Nguyen et al., 2016). The dataset contains 3.2 million text documents with more than 5,000 test queries for which the top-1 relevant document is provided.
Dataset Splits | Yes | SIFT Dataset (Jégou et al., 2011). We use the first 100 million 128-dimensional vectors from the SIFT dataset for our evaluation. We vary the database size by picking the first 2 million to 100 million vectors. We test 1,000 top-10 queries in the query set for each configuration, and measure the average recall@10 (the top-10 ground-truth nearest neighbors for each query are provided by the dataset). MS-MARCO Dataset (Nguyen et al., 2016). The dataset contains 3.2 million text documents with more than 5,000 test queries for which the top-1 relevant document is provided. ... We compute the average MRR@100 for the first 1,000 queries.
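The two search-quality metrics quoted above (recall@10 on SIFT, MRR@100 on MS-MARCO) can be sketched as follows. This is an illustrative implementation of the standard metric definitions, not code from the Pacmann repository; function names are hypothetical.

```python
def recall_at_10(retrieved, ground_truth):
    """recall@10: fraction of the true top-10 neighbors appearing in the
    first 10 retrieved results."""
    return len(set(retrieved[:10]) & set(ground_truth[:10])) / 10.0

def mrr_at_100(retrieved, relevant_doc):
    """MRR@100 contribution of one query: reciprocal rank of the relevant
    document among the first 100 results, or 0 if it is absent."""
    for rank, doc in enumerate(retrieved[:100], start=1):
        if doc == relevant_doc:
            return 1.0 / rank
    return 0.0

# 9 of the 10 ground-truth neighbors were retrieved:
print(recall_at_10(list(range(9)) + [99], list(range(10))))  # -> 0.9
# The single relevant document shows up at rank 2:
print(mrr_at_100(["d7", "d3", "d1"], "d3"))  # -> 0.5
```

The reported averages are simply these per-query values averaged over the 1,000 evaluated queries.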
Hardware Specification | Yes | The experiments are run on a single server with a 2.4GHz Intel Xeon E5-2680 CPU and 256 GB of RAM.
Software Dependencies | No | Our open-source implementation uses Python for data preprocessing and Golang for the core algorithm, including the graph ANN algorithm and the PIR scheme. Details of the implementation are provided in Appendix C. The experiments are run on a single server with a 2.4GHz Intel Xeon E5-2680 CPU and 256 GB of RAM. We evaluate the latency numbers on two simulated network settings: the local area network (LAN) setting with 5ms round-trip time (RTT) and the wide area network (WAN) setting with 50ms RTT. Although our implementation supports multi-threading optimization, we only use a single thread for all the experiments for a fair comparison. The paper mentions software names (Python, Golang) but does not provide specific version numbers for these or any libraries.
Experiment Setup | Yes | We tune the parameters of our scheme to achieve 0.90 recall@10 for each data point; that is, 9 out of 10 of the search results are indeed ground-truth top-10 nearest neighbors on average. The graph-building part follows our description in Figure 7 and sets the outbound number C = 32. ... 1) Beam Search: the number of parallel paths in the beam search algorithm is increased from 1 to 3. ... 3) Batched PIR: we enable the batch-mode PIR for each round of the search. ... We choose B = 16 and Q = 32 in our evaluation.
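As a rough illustration of what the beam-width and out-degree parameters above control, here is a minimal sketch of generic graph-based beam search over an ANN proximity graph. This is not Pacmann's exact algorithm: it omits the PIR layer and the paper's optimizations, and all names are illustrative.

```python
import heapq

def beam_search(graph, dist, query, start, beam_width=3, k=10):
    """Generic graph-based ANN beam search: keep `beam_width` parallel
    candidates, repeatedly expand their unvisited out-neighbors, and
    return the k visited nodes closest to the query."""
    visited = {start}
    beam = [(dist(query, start), start)]  # (distance, node) pairs
    while True:
        expanded = False
        for _, node in list(beam):
            for nbr in graph[node]:  # up to C out-neighbors per node
                if nbr not in visited:
                    visited.add(nbr)
                    heapq.heappush(beam, (dist(query, nbr), nbr))
                    expanded = True
        beam = heapq.nsmallest(beam_width, beam)  # trim to beam width
        if not expanded:  # nothing new reachable: search has converged
            break
    return heapq.nsmallest(k, [(dist(query, v), v) for v in visited])

# Toy usage on a 1-D "dataset" where a node's id is also its coordinate.
graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
dist = lambda q, v: abs(q - v)
print(beam_search(graph, dist, query=3, start=0, beam_width=2, k=2))
# -> [(0, 3), (1, 2)]
```

Raising the beam width from 1 to 3, as in the setup above, widens the frontier explored each round, which improves recall at the cost of more node fetches per round.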