Inductive Learning of Logical Theories with LLMs: An Expressivity-graded Analysis

Authors: João Pedro Gandarela de Souza, Danilo Carvalho, André Freitas

AAAI 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results indicate that the largest LLMs can achieve competitive results against a state-of-the-art (SOTA) Inductive Logic Programming (ILP) system baseline, but also that tracking long predicate relationship chains is a harder obstacle for LLMs than theory complexity. The paper presents a systematic methodology for evaluating the inductive learning properties of LLMs in the context of logic theory induction.
Researcher Affiliation | Academia | 1 Idiap Research Institute; 2 National Biomarker Centre, CRUK-MI, University of Manchester; 3 Department of Computer Science, University of Manchester. {firstname.lastname}@idiap.ch, {firstname.lastname}@manchester.ac.uk
Pseudocode | Yes | Algorithm 1: Iterative LM theory refinement.
Open Source Code | No | The paper states: "Prompt templates used were included in the supplementary material1." and mentions "A reusable and extensible framework for extending and assessing the inductive capabilities of LLMs." However, it does not provide an explicit statement or a direct link to the source code for the methodology described in the paper.
Open Datasets | No | The paper states: "In order to generate datasets for rigorous analysis, this study employed the RuDaS tool (Cornelio and Thost 2021) to systematically vary parameters such as noise, open-world degree, and missing data." It cites the tool used to generate synthetic datasets but does not provide direct access information (link, DOI, repository) for the specific datasets used in the experiments.
Dataset Splits | No | The paper mentions: "The mean values reported are based on the results obtained from the train set and evaluated on the test set." This implies train/test splits, but exact percentages, sample counts, and the split methodology are not specified.
Hardware Specification | Yes | Experiments with Popper, Llama3-8B-Instruct, Gemma-7B-It, and Mixtral-8x7B-Instruct-v0.1 were conducted on a computer with an Intel(R) Xeon(R) Gold 5217 CPU @ 3.00GHz, 188 GB RAM, and 2x NVIDIA RTX A6000 (48 GB VRAM) GPUs.
Software Dependencies | Yes | CUDA 12.3, PyTorch 2.2.2, and Transformers 4.41.2.
Experiment Setup | Yes | (1) employing Popper, with NuWLS (Chu, Cai, and Luo 2023) and WMaxCDCL, varying its time-limit parameter from 10 to 800 seconds; (2) applying the proposed iterative LM theory refinement method (Section "Proposed Approach"), with parameters Max_iter = 4 and MT_thresh = 1.0.
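The iterative refinement scheme named in the Pseudocode and Experiment Setup rows (Algorithm 1, with Max_iter = 4 and MT_thresh = 1.0) can be sketched as a generate-evaluate-refine loop. The function names `induce_theory`, `generate`, and `evaluate` below are hypothetical stand-ins, not the paper's actual interfaces: `generate` represents an LLM call producing a candidate theory, and `evaluate` scores it against the training examples.

```python
def induce_theory(examples, generate, evaluate, max_iter=4, mt_thresh=1.0):
    """Sketch of an iterative LM theory refinement loop.

    Repeatedly asks `generate` (a stand-in for an LLM prompt) for a
    candidate theory, scores it with `evaluate`, and feeds the previous
    attempt back as refinement context until the score reaches
    `mt_thresh` or `max_iter` iterations are exhausted.
    """
    best_theory, best_score = None, float("-inf")
    feedback = None
    for _ in range(max_iter):
        theory = generate(examples, feedback)   # LLM call (assumed interface)
        score = evaluate(theory, examples)      # e.g. fraction of examples covered
        if score > best_score:
            best_theory, best_score = theory, score
        if score >= mt_thresh:                  # MT_thresh: stop once good enough
            break
        feedback = theory                       # refine against the last attempt
    return best_theory, best_score


# Toy usage: dummy stand-ins whose score "improves" each iteration.
scores = iter([0.4, 0.7, 1.0])
theory, score = induce_theory(
    examples=[],
    generate=lambda ex, fb: "candidate theory",
    evaluate=lambda th, ex: next(scores),
)
```

With MT_thresh = 1.0 as in the paper's setup, the loop only terminates early on a perfect score; otherwise it returns the best candidate seen within Max_iter attempts.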