Select and Augment: Enhanced Dense Retrieval Knowledge Graph Augmentation
Authors: Micheal Abaho, Yousef H. Alfaifi
JAIR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiment results for Link Prediction demonstrate a 5.5% and 3.5% increase in the Mean Reciprocal Rank (MRR) and Hits@10 scores respectively, in comparison to text-enhanced knowledge graph augmentation methods using traditional CNNs. Our proposed method is evaluated on KG completion tasks (described under Section 4.3) such as link and relation prediction using the Freebase FB15k dataset (Veira et al., 2019). |
| Researcher Affiliation | Academia | Micheal Abaho EMAIL University of Liverpool, United Kingdom Yousef H. Alfaifi EMAIL Faculty of Computers and Information Technology, University of Tabuk, Tabuk, Saudi Arabia |
| Pseudocode | No | The paper describes the methodology using text and mathematical equations, but it does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | We adopt the Freebase (FB15K) Knowledge graph, the Babelnet corpus (Navigli & Ponzetto, 2012), Google News dataset, and Wikipedia articles (Veira et al., 2019), from which we assemble text descriptions for all the KG entities. |
| Dataset Splits | Yes | After pre-processing the gathered entity descriptions, the dataset is split into training, validation and test sets, and the resultant dataset statistics are presented in Table 4. Table 4 provides a breakdown of the Train, Validation (Val) and Test splits used in our experiments and the 3.8M entity descriptions which contain 96% of the entities within the KB. As shown in the table, the text descriptions respectively cover 51.8%, 30.4% and 29.9% of the Training, Validation and Testing triples. Table 1 (dataset statistics for both KB and text corpus): FB15K — 1,341; 14,904; 472,860 / 57,803 / 48,991 (train/val/test triples). Text — 3,814,190; 14,308; 244,946 / 17,572 / 14,599 (train/val/test triples). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions using a 'pre-trained SBERT model' and initializing KGE models such as 'TransE', 'DistMult', 'ComplEx', and 'RotatE'. However, it does not specify software dependencies such as programming languages or library versions (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | The number of negative samples per triple is set to 100 and k is set to 5. We tune all hyper-parameters using the validation data, and obtain optimal values as follows: learning rate 1e-3, batch size 8, KG embedding size 200. Further details on tuning bounds are provided in Table 4. Table 2 (parameter settings for DRKA): KG embedding dimension [50, 100, 200, 300], optimal 200; γ [0.5, 1.0, 1.5, 2.0], optimal 1.0; optimizer [SGD, Adam], optimal Adam; epochs [20, 50, 70, 100, 120], optimal 70; learning rate [5e-4, 1e-4, 5e-3, 1e-3, 5e-2, 1e-2], optimal 1e-3. |
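
The MRR and Hits@10 figures quoted above are the standard rank-based link-prediction metrics. As a minimal sketch (the `ranks` values below are illustrative, not taken from the paper), they can be computed from the rank of each gold entity among the scored candidates:

```python
def mrr(ranks):
    """Mean Reciprocal Rank: average of 1/rank over all test triples."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k=10):
    """Fraction of test triples whose gold entity ranks within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Illustrative ranks for five test triples (not from the paper)
ranks = [1, 3, 12, 2, 50]
print(round(mrr(ranks), 3))    # 0.387
print(hits_at_k(ranks, k=10))  # 0.6
```

A reported "5.5% increase in MRR" over a baseline would therefore correspond to gold entities being ranked higher on average across the FB15K test triples.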
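
The setup row states that 100 negative samples are drawn per triple. A common way to realise this, sketched below under the assumption of uniform corruption of either the head or the tail entity (the paper does not state its corruption strategy, so this is illustrative only):

```python
import random

def corrupt_triple(triple, entities, n_neg=100, seed=0):
    """Generate n_neg corrupted copies of (h, r, t) by replacing h or t
    with a uniformly sampled entity, discarding accidental copies of the
    original triple."""
    rng = random.Random(seed)
    h, r, t = triple
    negatives = []
    while len(negatives) < n_neg:
        e = rng.choice(entities)
        # Corrupt the head or the tail with equal probability
        neg = (e, r, t) if rng.random() < 0.5 else (h, r, e)
        if neg != triple:
            negatives.append(neg)
    return negatives

# Hypothetical toy triple and entity vocabulary, for illustration only
negs = corrupt_triple(("liverpool", "located_in", "uk"),
                      entities=["liverpool", "uk", "tabuk", "paris"],
                      n_neg=100)
print(len(negs))  # 100
```

Note that this sketch does not filter corruptions that happen to be other true triples in the KG (so-called "filtered" evaluation); whether the paper applies that filter is not specified in the extracted quotes.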