DOMBA: Double Model Balancing for Access-Controlled Language Models via Minimum-Bounded Aggregation

Authors: Tom Segal, Asaf Shabtai, Yuval Elovici

AAAI 2025

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | "A detailed mathematical analysis and extensive evaluation show that DOMBA safeguards restricted information while offering utility comparable to non-secure models. ... We evaluate DOMBA's performance on two access-controlled datasets, mimicking real world organizations needs. Our evaluation showed that DOMBA achieves a better security-utility trade-off than existing methods, across both datasets, two utility metrics and four security metrics."

Researcher Affiliation | Academia | "Tom Segal, Asaf Shabtai, Yuval Elovici, Ben-Gurion University of the Negev, EMAIL, EMAIL, EMAIL"

Pseudocode | No | The paper describes the methodology in narrative text with definitions and theorems, but it does not include any explicitly labeled pseudocode or algorithm blocks with structured steps.

Open Source Code | Yes | "Code and datasets https://github.com/ppo1/DOMBA"

Open Datasets | Yes | "The first dataset we utilized is the IMDB Spoiler reviews dataset (Misra 2019). ... The second dataset used is the Food.com Recipes and Interactions dataset (Li 2019). We utilized class labels of the Food-101 dataset (Bossard, Guillaumin, and Van Gool 2014). Code and datasets https://github.com/ppo1/DOMBA"

Dataset Splits | Yes | "The number of reviews totaled 22,742, with 10% of each movie's reviews set aside for evaluation. ... The number of recipes totaled 10,829, with 10% of each class put aside for evaluation."
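The per-group 10% holdout described in the split quote can be sketched as follows. The paper does not state how the 10% of each movie's reviews (or each recipe class) was selected, so the random selection, the function name, and the seed below are assumptions, not the authors' procedure.

```python
import random
from collections import defaultdict

def per_group_holdout(items, group_key, eval_frac=0.1, seed=0):
    """Set aside eval_frac of each group's items for evaluation.

    Mirrors the per-movie / per-class 10% split described in the paper;
    the random selection within each group is an assumption.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for item in items:
        groups[group_key(item)].append(item)
    train, evaluation = [], []
    for members in groups.values():
        rng.shuffle(members)
        n_eval = max(1, int(len(members) * eval_frac))
        evaluation.extend(members[:n_eval])
        train.extend(members[n_eval:])
    return train, evaluation

# Usage sketch: 10% of each movie's reviews go to the evaluation set.
reviews = [{"movie": m, "review_id": i} for m in ("m1", "m2") for i in range(10)]
train_set, eval_set = per_group_holdout(reviews, lambda r: r["movie"])
```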
Hardware Specification | Yes | "All experiments were conducted on an NVIDIA A100-SXM4-40GB GPU."

Software Dependencies | No | The paper mentions using LoRA (Hu et al. 2021) and OpenAI GPT (Radford et al. 2018) and provides hyperparameters for LoRA, but it does not specify version numbers for LoRA, OpenAI GPT, or any other core software libraries or languages used (e.g., Python, PyTorch/TensorFlow versions).

Experiment Setup | Yes | "Regarding training parameters, we conducted experiments with varying numbers of training epochs (1, 2, and 4). The hyperparameters for LORA were set to default values and were not explored: r=64, lora alpha=32, lora dropout=0.05, optimizer=paged adamw 32bit, learning rate=5e-4, and warmup ratio=0.03."
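The reported training setup can be collected into a single configuration fragment for reference. Only the values are taken from the quoted excerpt; the dictionary keys and variable names are illustrative assumptions about how such a configuration would be wired into a LoRA fine-tuning run.

```python
# Hyperparameter values as reported in the paper; the key names and
# grouping are illustrative, not the authors' actual code.
LORA_HYPERPARAMS = {
    "r": 64,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "optimizer": "paged_adamw_32bit",
    "learning_rate": 5e-4,
    "warmup_ratio": 0.03,
}

# Epoch counts explored in the experiments.
EPOCH_GRID = [1, 2, 4]
```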