Learned-Database Systems Security
Authors: Roei Schuster, Jin Peng Zhou, Thorsten Eisenhofer, Paul Grubbs, Nicolas Papernot
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To empirically validate the vulnerabilities surfaced by our framework, we choose 3 of them and implement and evaluate exploits against these. |
| Researcher Affiliation | Collaboration | Roei Schuster (Context AI); Jin Peng Zhou (Cornell University, Department of Computer Science); Thorsten Eisenhofer (BIFOLD & TU Berlin); Paul Grubbs (University of Michigan); Nicolas Papernot (University of Toronto & Vector Institute) |
| Pseudocode | No | The paper describes methods and processes in narrative text and conceptual diagrams, such as 'Attack method' in Section 4 and 'Insertion internals' in Section 5, but does not present any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement by the authors about releasing their own source code for the methodology described, nor does it provide a direct link to a code repository. While it mentions using 'PGM's open-source implementation' in Section 6, this refers to a third-party tool, not the authors' implementation of their exploits. |
| Open Datasets | Yes | We use the Cardinality Estimation Benchmark (CEB) Negi et al. (2021b)..., We evaluate the attack on two datasets from the original publication Ding et al. (2020a). First, the YCSB dataset, containing user IDs from the YCSB benchmark Cooper et al. (2010), and second the longitudes dataset with longitudes of locations around the world from Open Street Maps. |
| Dataset Splits | Yes | We divide the queries into two disjoint 20-query sets, Set A and Set B. For each set, we measure each query's latency in two scenarios, member and nonmember..., We assume BAO trains using N = 100 queries. We select 167 target queries from CEB... We construct a benign training set by randomly sampling 100 queries (benign queries) from CEB, and an adversarial training set by replacing 1 randomly-chosen benign query with one of our target queries., We add 100 attacker keys to each dataset, uniformly from [µ1 − 2σ1, µ1 + 2σ1] (within the approximate range of keys in both A and B). |
| Hardware Specification | Yes | We run ALEX on a server with two Intel Xeon E5-2686 CPUs and 512GB of memory. ... we additionally run the attack on a machine with an Intel i7-7600U CPU and 16GB of memory., We run our experiments on a machine with an Intel i9-9940X CPU and 128 GB of memory. |
| Software Dependencies | No | The paper mentions software like 'Postgres' (Section 4), 'PGM's open-source implementation' (Section 6), and a 'B-tree CPP implementation B-tree (2011)' (Section 6), but it does not provide specific version numbers for these or any other ancillary software dependencies used in their experimental setup, which is required for reproducibility. |
| Experiment Setup | Yes | We assume BAO trains using N = 100 queries. ... for ALEX's default parameterization, 16MB ... optimal piecewise-linear function which approximates, up to an additive error parameter ϵ ... We repeat this 50 times for multiple values of ϵ |
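The Experiment Setup row quotes the paper's use of a PGM-style index, which maps keys to positions via a piecewise-linear function whose prediction is guaranteed to be within an additive error ϵ of the true position. As a minimal sketch of that guarantee (using the classic one-pass "shrinking cone" fit rather than PGM's optimal segmentation, and assuming strictly increasing keys; the function names are illustrative, not from the paper):

```python
import bisect
import math

def fit_segments(keys, eps):
    """One-pass 'shrinking cone' piecewise-linear fit over sorted, distinct
    keys. Every key in a segment is placed within +/- eps of its true
    position, mirroring the additive-error guarantee of PGM-style indexes."""
    segments = []               # (start_key, start_pos, slope)
    start, lo, hi = 0, 0.0, math.inf
    for i in range(1, len(keys)):
        dx = keys[i] - keys[start]
        # Slope range that keeps key i within +/- eps of position i.
        new_lo = (i - start - eps) / dx
        new_hi = (i - start + eps) / dx
        if max(lo, new_lo) > min(hi, new_hi):
            # Cone emptied: close the current segment, open a new one at i.
            segments.append((keys[start], start, (lo + hi) / 2))
            start, lo, hi = i, 0.0, math.inf
        else:
            lo, hi = max(lo, new_lo), min(hi, new_hi)
    slope = (lo + hi) / 2 if math.isfinite(hi) else 0.0
    segments.append((keys[start], start, slope))
    return segments

def predict_pos(segments, key):
    """Approximate position of `key`; the true position is within +/- eps."""
    starts = [s[0] for s in segments]
    start_key, start_pos, slope = segments[bisect.bisect_right(starts, key) - 1]
    return start_pos + slope * (key - start_key)
```

A lookup then only needs to binary-search the window `keys[pred - eps : pred + eps + 1]`, which is why ϵ directly controls the latency behavior the paper measures.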