reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

FORTRESS: Fast, Tuning-Free Retrieval Ensemble for Scalable LLM Safety

Authors: Chi-Wei Chang, Richard Tzong-Han Tsai

TMLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive evaluation across nine safety benchmarks demonstrates that FORTRESS achieves state-of-the-art performance with an F1 score of 91.6%, while operating over five times faster than leading fine-tuned classifiers.
Researcher Affiliation	Academia	Chi-Wei Chang: Center of GIS, Academia Sinica; Chingshin Academy, Taiwan. Richard Tzong-Han Tsai: Center of GIS, Academia Sinica; National Central University, Taiwan.
Pseudocode	Yes	Algorithm 1: FORTRESS Ensemble Strategy. Input: primary results $D_{\text{primary}}$, perplexity result $R_{\text{perp}}$. Parameters: $T_{\text{ratio}}$, $(W_p^{\text{def}}, W_s^{\text{def}})$, $(W_p^{\text{mix}}, W_s^{\text{mix}})$. Output: classification in {SAFE, UNSAFE}.
Open Source Code	No	The paper does not provide an explicit statement about open-sourcing its own code or a link to a code repository for the methodology described.
Open Datasets	Yes	Table 1: Source datasets used for knowledge base creation and evaluation. The names used here are the full, formal names of the benchmarks, which are abbreviated in some tables in the main text for brevity. Availability column shows direct links like nvidia/Aegis-AI-Content-Safety-Dataset-2.0
Dataset Splits	Yes	The experiment was performed using the FORTRESS Gemma 1B configuration with its default knowledge base... We employed a 5-fold cross-validation protocol over the entire FORTRESS dataset.
Hardware Specification	Yes	Table 11: Computing infrastructure used for all experiments. CPU: AMD RYZEN 9 7900 12-Core Processor; GPU: 1x NVIDIA RTX 3090; GPU Memory: 24 GB GDDR6; System Memory: 64 GB DDR5.
Software Dependencies	Yes	Table 11: Computing infrastructure used for all experiments. Python 3.12; PyTorch 2.7.0; Transformers 4.51.3; FAISS 1.8.0 (faiss-gpu); ChromaDB 1.0.9; scikit-learn 1.6.1; scikit-optimize 0.10.2; NumPy 2.2.6; Pandas 2.2.3.
Experiment Setup	Yes	Table 9 lists the final hyperparameters used for all experiments reported in the main paper. These values were selected based on preliminary experiments and sensitivity analyses discussed in the main text.
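
The Dataset Splits row above quotes a 5-fold cross-validation protocol. As a rough illustration of what such a protocol involves, the sketch below partitions sample indices into five disjoint test folds using only the Python standard library. It is not the authors' code: the function name `k_fold_indices`, the shuffling, the seed, and the absence of stratification are all assumptions made for illustration, since the quoted text does not specify them.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper: shuffling and the seed are assumptions, not
    details taken from the paper.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")

    indices = list(range(n_samples))
    # Assumption: a seeded, unstratified shuffle before splitting.
    random.Random(seed).shuffle(indices)

    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]

    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


if __name__ == "__main__":
    # Each sample lands in exactly one test fold across the five rounds.
    for fold, (train_idx, test_idx) in enumerate(k_fold_indices(23, k=5)):
        print(f"fold {fold}: train={len(train_idx)} test={len(test_idx)}")
```

In practice, each round would fit or configure the system on `train_idx` and score it on `test_idx`, then average the metric over the five rounds. Since scikit-learn appears in the Software Dependencies row, its `KFold` or `StratifiedKFold` splitters would be a natural substitute for this hand-rolled helper.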