Learning Symbolic Rules for Reasoning in Quasi-Natural Language

Authors: Kaiyu Yang, Jia Deng

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We benchmark our method on 3 tasks: learning compositional instructions, logical reasoning, and morphological analysis. For compositional instructions, our method not only achieves 100% accuracy on MiniSCAN (Lake et al., 2019) and SCAN (Lake & Baroni, 2018), but also recovers the ground truth rules. For logical reasoning, it achieves state-of-the-art performance on RuleTaker (Clark et al., 2020), including the noisy data paraphrased by crowd workers. For morphological analysis, it learns morphological rules from real-world linguistic data and is competitive with neural seq2seq models in some languages.
Researcher Affiliation | Academia | Kaiyu Yang (EMAIL), Department of Computer Science, Princeton University; Jia Deng (EMAIL), Department of Computer Science, Princeton University
Pseudocode | Yes | Algorithm 1: MetaInduce. Input: Training data D_train = {(A_i, g_i)}_{i=1}^n; A_i is the assumptions; g_i is the goal. Output: Model M consisting of a set of rules.
Open Source Code | Yes | The code is available at https://github.com/princeton-vl/MetaQNL.jl.
Open Datasets | Yes | We instantiate MetaQNL/MetaInduce on three tasks: learning compositional instructions on MiniSCAN (Lake et al., 2019)/SCAN (Lake & Baroni, 2018), logical reasoning on RuleTaker (Clark et al., 2020), and morphological analysis on SIGMORPHON 2018 (Cotterell et al., 2018).
Dataset Splits | Yes | For SCAN, we train only on the 400 shortest examples and test on four different splits: simple, length, addprim_jump, and addprim_turn_left. ... For each language, they sample a training set of 1K examples and three test sets of 100 examples each (FUT, PST, and OTHER).
Hardware Specification | Yes | On machines with 0 GPUs, 32GB RAM, and 4 CPUs, we run MetaInduce for 5 epochs on 10K training examples, which takes about 20 hours. ... Our experiments take 30 minutes to run on a laptop.
Software Dependencies | No | We use backward chaining as the prover and Z3 (De Moura & Bjørner, 2008) as the MAX-SAT solver. ... The soft matching network is implemented by finetuning a T5 model (Raffel et al., 2020). ... using the AdamW optimizer (Loshchilov & Hutter, 2019).
Experiment Setup | Yes | We use forward chaining as the prover and a depth limit of 7. The hyperparameters λ+ and λ− are tuned on validation data. ... We finetune the model with a learning rate of 10^-4 and a batch size of 32 using the AdamW optimizer (Loshchilov & Hutter, 2019).
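The MetaInduce pseudocode quoted above takes training pairs of (assumptions, goal) and returns a set of rules. The loop below is an illustrative skeleton only, not the authors' algorithm: the `propose_rules` helper is hypothetical, and the paper's actual candidate proposal and rule pruning are far more involved.

```python
# Illustrative MetaInduce-style skeleton (hypothetical helper logic).
# A rule is (frozenset of premises, conclusion).

def propose_rules(example):
    """Hypothetical proposal step: one rule per example, mapping its
    assumptions directly to its goal (no abstraction/generalization)."""
    assumptions, goal = example
    return {(frozenset(assumptions), goal)}

def meta_induce(train_data, epochs=1):
    """Return a set of rules induced from (assumptions, goal) pairs."""
    rules = set()
    for _ in range(epochs):
        for example in train_data:
            rules |= propose_rules(example)
    # The real algorithm would prune `rules` here (e.g. via MAX-SAT),
    # trading off proof coverage against model size.
    return rules

train = [(("run twice",), "RUN RUN"), (("jump",), "JUMP")]
print(meta_induce(train))
```

The skeleton keeps the same interface as the pseudocode (data in, rule set out), which is the part the quoted algorithm header pins down.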
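The software-dependencies row mentions Z3 as a MAX-SAT solver for rule selection. To show what kind of problem that solver answers, here is a toy brute-force weighted MAX-SAT over rule subsets; the objective (examples proved minus a size penalty) is a stand-in assumption, not the paper's actual encoding, and real use would call a solver like Z3 instead of enumerating subsets.

```python
from itertools import product

def proves(rule_subset, needed_rules):
    # Toy assumption: an example is "proved" iff every rule its
    # proof needs was kept in the subset.
    return needed_rules <= rule_subset

def select_rules(all_rules, examples, lam=0.5):
    """Brute-force weighted MAX-SAT: maximize (#examples proved)
    minus lam * (#rules kept)."""
    best, best_score = frozenset(), float("-inf")
    for bits in product([0, 1], repeat=len(all_rules)):
        subset = frozenset(r for r, b in zip(all_rules, bits) if b)
        score = sum(proves(subset, ex) for ex in examples) - lam * len(subset)
        if score > best_score:
            best, best_score = subset, score
    return best

rules = ["r1", "r2", "r3"]
examples = [{"r1"}, {"r1", "r2"}]   # rules each example's proof needs
print(select_rules(rules, examples))  # r3 is never needed, so it is dropped
```

Enumerating 2^n subsets is only viable for toy inputs, which is exactly why an off-the-shelf MAX-SAT solver is used in practice.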
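The experiment-setup row specifies forward chaining with a depth limit of 7. A minimal propositional version of that prover configuration can be sketched as follows; note the real MetaQNL prover works over quasi-natural-language sentences with variables, which this toy version omits.

```python
# Depth-limited forward chaining over propositional rules.
# Rules are (frozenset of premise facts, conclusion fact).

def forward_chain(facts, rules, max_depth=7):
    """Repeatedly fire rules whose premises are all known,
    for at most `max_depth` rounds; return all derived facts."""
    known = set(facts)
    for _ in range(max_depth):
        derived = {concl for prems, concl in rules
                   if prems <= known and concl not in known}
        if not derived:  # fixpoint reached before the depth limit
            break
        known |= derived
    return known

rules = [(frozenset({"a"}), "b"), (frozenset({"b"}), "c")]
print(sorted(forward_chain({"a"}, rules)))  # ['a', 'b', 'c']
```

With `max_depth=1` the same query would stop at `{'a', 'b'}`, which is the effect the depth limit of 7 has on longer proof chains.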