KBLaM: Knowledge Base augmented Language Model

Authors: Xi Wang, Taketomo Isazawa, Liana Mikaelyan, James Hensman

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments demonstrate KBLaM's effectiveness in various tasks, including question-answering and open-ended reasoning, while providing interpretable insights into its use of the augmented knowledge." "In this section, we perform empirical evaluation for KBLaM."
Researcher Affiliation | Collaboration | Xi Wang (Johns Hopkins University, EMAIL); Taketomo Isazawa* (Microsoft Research, EMAIL); Liana Mikaelyan (Microsoft, EMAIL); James Hensman (Microsoft Research, EMAIL)
Pseudocode | No | The paper describes methods and processes verbally and with diagrams (Figure 2), but does not include any explicitly labeled pseudocode or algorithm blocks with structured steps.
Open Source Code | Yes | "Code and datasets are available at https://github.com/microsoft/KBLaM/"
Open Datasets | Yes | "Code and datasets are available at https://github.com/microsoft/KBLaM/" "Lastly, we release our training and evaluation KBs, which can help future research in augmenting LLM with KB, as well as in other topics such as long-context language models, hallucination detection/reduction, and structured attention." "Enron: A KB constructed from the Enron (Klimt & Yang, 2004) dataset, an open-sourced corporate email dataset."
Dataset Splits | Yes | "To construct each training sample, we perform the following procedure: We randomly select a subset of 10 to 100 triples from the synthetic KB to form a sample-specific KB." "For evaluation, we considered the following two KB datasets" "Synthetic data: The validation set of the synthetic KB, i.e. the 15000 triples not used for training." "We consider a setting where, given a KB, we ask the model 100 questions in total, out of which 80 questions are answerable, and the other 20 are not."
Hardware Specification | Yes | "The instruction tuning is performed on a single 80GB A100 GPU under bfloat16 without any parameter-efficient tuning methods."
Software Dependencies | No | "For all experiments, we use the instruction fine-tuned version of Llama3 8B (Dubey et al., 2024) as the backbone LLM, and OpenAI's ada-002 sentence embedding model (P = 1536) as the pre-trained encoder for computing base key and value embeddings (Eq. (5))." The paper names the specific models used (Llama3 8B, OpenAI's ada-002) but does not provide version numbers for the software libraries or programming languages required for replication.
Experiment Setup | Yes | "Optimization is conducted using AdamW (Loshchilov, 2017) with a step size of 5 × 10⁻⁴ and a cosine learning rate decay to 5 × 10⁻⁶ for 20K iterations. Each iteration uses a mini-batch of 400 Q&A pairs, composed of 20 micro-batches of 20 samples."
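The quoted optimizer configuration pins down the learning-rate schedule completely. As a minimal sketch, the cosine decay from 5 × 10⁻⁴ to 5 × 10⁻⁶ over 20K iterations could look like the following; the function name and interface are illustrative, not from the paper.

```python
import math

def cosine_lr(step, total_steps=20_000, lr_max=5e-4, lr_min=5e-6):
    """Cosine learning-rate decay: lr_max at step 0, lr_min at total_steps."""
    progress = min(step, total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))

# Each iteration processes 400 Q&A pairs as 20 micro-batches of 20 samples
# (i.e. gradient accumulation), matching the quoted setup.
MICRO_BATCHES, MICRO_BATCH_SIZE = 20, 20
EFFECTIVE_BATCH = MICRO_BATCHES * MICRO_BATCH_SIZE  # 400
```

The schedule values (5e-4, 5e-6, 20K iterations) and batch composition are taken directly from the quote above; everything else is a sketch.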
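The Dataset Splits row describes a concrete sampling procedure: each training sample draws a sample-specific KB of 10 to 100 triples, and each evaluation run asks 100 questions of which 80 are answerable. A minimal sketch of that procedure, with all names and the triple format assumed for illustration:

```python
import random

def build_training_sample(kb_triples, rng, min_size=10, max_size=100):
    """Randomly select a subset of 10 to 100 triples as a sample-specific KB."""
    size = rng.randint(min_size, max_size)
    return rng.sample(kb_triples, size)

def build_eval_questions(rng, total=100, answerable=80):
    """Label the evaluation questions: 80 answerable, 20 unanswerable."""
    labels = ["answerable"] * answerable + ["unanswerable"] * (total - answerable)
    rng.shuffle(labels)
    return labels

rng = random.Random(0)
# Hypothetical synthetic KB of (name, property, value) triples.
kb = [(f"entity_{i}", "description", f"value_{i}") for i in range(15_000)]
sample_kb = build_training_sample(kb, rng)
labels = build_eval_questions(rng)
```

This only mirrors the split sizes stated in the quote; the paper's actual triple contents and question generation are not reproduced here.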