KBLaM: Knowledge Base augmented Language Model
Authors: Xi Wang, Taketomo Isazawa, Liana Mikaelyan, James Hensman
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate KBLaM's effectiveness in various tasks, including question-answering and open-ended reasoning, while providing interpretable insights into its use of the augmented knowledge. In this section, we perform empirical evaluation for KBLaM. |
| Researcher Affiliation | Collaboration | Xi Wang (Johns Hopkins University); Taketomo Isazawa (Microsoft Research); Liana Mikaelyan (Microsoft); James Hensman (Microsoft Research) |
| Pseudocode | No | The paper describes methods and processes verbally and with diagrams (Figure 2), but does not include any explicitly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | Yes | Code and datasets are available at https://github.com/microsoft/KBLaM/ |
| Open Datasets | Yes | Code and datasets are available at https://github.com/microsoft/KBLaM/ Lastly, we release our training and evaluation KBs, which can help future research in augmenting LLMs with KBs, as well as in other topics such as long-context language models, hallucination detection/reduction, and structured attention. Enron: a KB constructed from the Enron (Klimt & Yang, 2004) dataset, an open-source corporate email dataset. |
| Dataset Splits | Yes | To construct each training sample, we perform the following procedure: We randomly select a subset of 10 to 100 triples from the synthetic KB to form a sample-specific KB. For evaluation, we considered the following two KB datasets. Synthetic data: the validation set of the synthetic KB, i.e. the 15,000 triples not used for training. We consider a setting where, given a KB, we ask the model 100 questions in total, out of which 80 questions are answerable, and the other 20 are not. |
| Hardware Specification | Yes | The instruction tuning is performed on a single 80GB A100 GPU under bfloat16 without any parameter-efficient tuning methods. |
| Software Dependencies | No | For all experiments, we use the instruction fine-tuned version of Llama3 8B (Dubey et al., 2024) as the backbone LLM, and OpenAI's ada-002 sentence embedding model (P = 1536) as the pre-trained encoder for computing base key and value embeddings (Eq. (5)). The paper names the models used (Llama3 8B, OpenAI's ada-002) but does not provide version numbers for the software libraries or programming languages required for replication. |
| Experiment Setup | Yes | Optimization is conducted using AdamW (Loshchilov, 2017) with a step size of 5×10⁻⁴ and a cosine learning rate decay to 5×10⁻⁶ for 20K iterations. Each iteration uses a mini-batch of 400 Q&A pairs, composed of 20 micro-batches of 20 samples. |
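The dataset-split row describes a concrete sampling procedure: each training sample draws a KB of 10 to 100 triples, and each evaluation KB is paired with 100 questions, 80 answerable and 20 not. A minimal sketch of that procedure is below; all function names, question templates, and pool sizes are hypothetical illustrations, not taken from the paper's released code.

```python
import random

rng = random.Random(0)

# Placeholder synthetic triples of the form (name, property, value).
triples = [(f"entity_{i}", "description", f"value_{i}") for i in range(1000)]

def sample_kb(pool, kb_min=10, kb_max=100):
    """Draw a sample-specific KB of 10 to 100 triples, as in training."""
    return rng.sample(pool, rng.randint(kb_min, kb_max))

def build_eval_set(kb, distractor_pool, n_total=100, n_answerable=80):
    """Pair a KB with 80 answerable and 20 unanswerable questions.

    Answerable questions are generated from triples inside the KB
    (with replacement, since the KB may hold fewer than 80 triples);
    unanswerable ones come from triples outside it.
    """
    answerable = [f"What is the {p} of {n}?"
                  for n, p, _ in rng.choices(kb, k=n_answerable)]
    unanswerable = [f"What is the {p} of {n}?"
                    for n, p, _ in rng.choices(distractor_pool,
                                              k=n_total - n_answerable)]
    labels = (["answerable"] * n_answerable
              + ["unanswerable"] * (n_total - n_answerable))
    questions = list(zip(answerable + unanswerable, labels))
    rng.shuffle(questions)
    return questions

kb = sample_kb(triples[:800])
questions = build_eval_set(kb, triples[800:])
```

The 80/20 split lets the evaluation probe both retrieval accuracy and the model's ability to refuse questions the KB cannot answer.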
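The experiment-setup row fully specifies the learning-rate schedule: cosine decay from 5×10⁻⁴ to 5×10⁻⁶ over 20K iterations, with each iteration accumulating 20 micro-batches of 20 samples into an effective batch of 400. A small sketch of that schedule, written in plain Python rather than any particular training framework, is:

```python
import math

PEAK_LR, FINAL_LR, TOTAL_STEPS = 5e-4, 5e-6, 20_000
MICRO_BATCHES, MICRO_SIZE = 20, 20  # 20 x 20 = 400 Q&A pairs per iteration

def cosine_lr(step, peak=PEAK_LR, final=FINAL_LR, total=TOTAL_STEPS):
    """Cosine decay from `peak` at step 0 to `final` at step `total`."""
    t = min(step, total) / total
    return final + 0.5 * (peak - final) * (1 + math.cos(math.pi * t))

# The schedule starts at the peak rate and ends at the final rate.
lr_start = cosine_lr(0)            # 5e-4
lr_end = cosine_lr(TOTAL_STEPS)    # 5e-6
effective_batch = MICRO_BATCHES * MICRO_SIZE  # 400
```

In a framework such as PyTorch, the same effect is typically achieved with an AdamW optimizer plus a cosine-annealing scheduler, stepping the optimizer once per 20 accumulated micro-batches.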