DOMBA: Double Model Balancing for Access-Controlled Language Models via Minimum-Bounded Aggregation

Authors: Tom Segal, Asaf Shabtai, Yuval Elovici

AAAI 2025

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | "A detailed mathematical analysis and extensive evaluation show that DOMBA safeguards restricted information while offering utility comparable to non-secure models. ... We evaluate DOMBA's performance on two access-controlled datasets, mimicking real world organizations needs. Our evaluation showed that DOMBA achieves a better security-utility trade-off than existing methods, across both datasets, two utility metrics and four security metrics."

Researcher Affiliation | Academia | "Tom Segal, Asaf Shabtai, Yuval Elovici, Ben-Gurion University of the Negev, EMAIL, EMAIL, EMAIL"

Pseudocode | No | The paper describes the methodology in narrative text with definitions and theorems, but it does not include any explicitly labeled pseudocode or algorithm blocks with structured steps.

Open Source Code | Yes | "Code and datasets https://github.com/ppo1/DOMBA"

Open Datasets | Yes | "The first dataset we utilized is the IMDB Spoiler reviews dataset (Misra 2019). ... The second dataset used is the Food.com Recipes and Interactions dataset (Li 2019). We utilized class labels of the Food-101 dataset (Bossard, Guillaumin, and Van Gool 2014). Code and datasets https://github.com/ppo1/DOMBA"

Dataset Splits | Yes | "The number of reviews totaled 22,742, with 10% of each movie's reviews set aside for evaluation. ... The number of recipes totaled 10,829, with 10% of each class put aside for evaluation."
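The per-group 10% holdout described in the split quote can be sketched as follows. The paper does not state how the 10% of each movie's reviews (or each recipe class) was selected, so the random selection, the function name, and the seed below are assumptions, not the authors' procedure.

```python
import random
from collections import defaultdict

def per_group_holdout(items, group_key, eval_frac=0.1, seed=0):
    """Set aside eval_frac of each group's items for evaluation.

    Mirrors the per-movie / per-class 10% split described in the paper;
    the random selection within each group is an assumption.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for item in items:
        groups[group_key(item)].append(item)
    train, evaluation = [], []
    for members in groups.values():
        rng.shuffle(members)
        n_eval = max(1, int(len(members) * eval_frac))
        evaluation.extend(members[:n_eval])
        train.extend(members[n_eval:])
    return train, evaluation

# Usage sketch: 10% of each movie's reviews go to the evaluation set.
reviews = [{"movie": m, "review_id": i} for m in ("m1", "m2") for i in range(10)]
train_set, eval_set = per_group_holdout(reviews, lambda r: r["movie"])
```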
Hardware Specification | Yes | "All experiments were conducted on an NVIDIA A100-SXM4-40GB GPU."

Software Dependencies | No | The paper mentions using LoRA (Hu et al. 2021) and OpenAI GPT (Radford et al. 2018) and provides hyperparameters for LoRA, but it does not specify version numbers for LoRA, OpenAI GPT, or any other core software libraries or languages used (e.g., Python, PyTorch/TensorFlow versions).

Experiment Setup | Yes | "Regarding training parameters, we conducted experiments with varying numbers of training epochs (1, 2, and 4). The hyperparameters for LORA were set to default values and were not explored: r=64, lora alpha=32, lora dropout=0.05, optimizer=paged adamw 32bit, learning rate=5e-4, and warmup ratio=0.03."
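The reported training setup can be collected into a single configuration fragment for reference. Only the values are taken from the quoted excerpt; the dictionary keys and variable names are illustrative assumptions about how such a configuration would be wired into a LoRA fine-tuning run.

```python
# Hyperparameter values as reported in the paper; the key names and
# grouping are illustrative, not the authors' actual code.
LORA_HYPERPARAMS = {
    "r": 64,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "optimizer": "paged_adamw_32bit",
    "learning_rate": 5e-4,
    "warmup_ratio": 0.03,
}

# Epoch counts explored in the experiments.
EPOCH_GRID = [1, 2, 4]
```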