Learning Symbolic Rules for Reasoning in Quasi-Natural Language
Authors: Kaiyu Yang, Jia Deng
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We benchmark our method on 3 tasks: learning compositional instructions, logical reasoning, and morphological analysis. For compositional instructions, our method not only achieves 100% accuracy on MiniSCAN (Lake et al., 2019) and SCAN (Lake & Baroni, 2018), but also recovers the ground truth rules. For logical reasoning, it achieves state-of-the-art performance on RuleTaker (Clark et al., 2020), including the noisy data paraphrased by crowd workers. For morphological analysis, it learns morphological rules from real-world linguistic data and is competitive with neural seq2seq models in some languages. |
| Researcher Affiliation | Academia | Kaiyu Yang (EMAIL), Department of Computer Science, Princeton University; Jia Deng (EMAIL), Department of Computer Science, Princeton University |
| Pseudocode | Yes | Algorithm 1: MetaInduce. Input: Training data D_train = {(A_i, g_i)}_{i=1}^n; A_i is the assumptions; g_i is the goal. Output: Model M consisting of a set of rules |
| Open Source Code | Yes | The code is available at https://github.com/princeton-vl/MetaQNL.jl. |
| Open Datasets | Yes | We instantiate MetaQNL/MetaInduce on three tasks: learning compositional instructions on MiniSCAN (Lake et al., 2019)/SCAN (Lake & Baroni, 2018), logical reasoning on RuleTaker (Clark et al., 2020), and morphological analysis on SIGMORPHON 2018 (Cotterell et al., 2018). |
| Dataset Splits | Yes | For SCAN, we train only on the 400 shortest examples and test on four different splits: simple, length, addprim_jump, and addprim_turn_left. ... For each language, they sample a training set of 1K examples and three test sets of 100 examples each (FUT, PST, and OTHER). |
| Hardware Specification | Yes | On machines with 0 GPUs, 32GB RAM, and 4 CPUs, we run MetaInduce for 5 epochs on 10K training examples, which takes about 20 hours. ... Our experiments take 30 minutes to run on a laptop. |
| Software Dependencies | No | We use backward chaining as the prover and Z3 (De Moura & Bjørner, 2008) as the MAX-SAT solver. ... The soft matching network is implemented by finetuning a T5 model (Raffel et al., 2020). ... using the AdamW optimizer (Loshchilov & Hutter, 2019). |
| Experiment Setup | Yes | We use forward chaining as the prover and a depth limit of 7. The hyperparameters λ+ and λ− are tuned on validation data. ... We finetune the model with a learning rate of 10^-4 and a batch size of 32 using the AdamW optimizer (Loshchilov & Hutter, 2019). |
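The pseudocode row above describes MetaInduce as selecting a set of rules from training data, with the setup row noting a MAX-SAT formulation (solved by Z3) and complexity penalties λ+/λ−. The sketch below is a deliberately toy illustration of that rule-selection idea, not the paper's method: the rule representation, the single penalty `LAMBDA`, and the brute-force subset search (standing in for the Z3 MAX-SAT solve) are all hypothetical simplifications.

```python
from itertools import chain, combinations

# Candidate rewrite rules (lhs -> rhs), as might be proposed for MiniSCAN-style
# data. The third rule is spurious; selection should discard it.
CANDIDATES = [("jump", "JUMP"), ("walk", "WALK"), ("jump", "WALK")]

# Training examples: (input tokens, expected output tokens).
TRAIN = [
    (["jump"], ["JUMP"]),
    (["walk"], ["WALK"]),
    (["jump", "walk"], ["JUMP", "WALK"]),
]

# Single per-rule complexity penalty; a toy stand-in for the paper's
# separate λ+ and λ− hyperparameters.
LAMBDA = 0.5

def apply(rules, tokens):
    """Rewrite each token with the first matching rule, else keep it."""
    out = []
    for t in tokens:
        for lhs, rhs in rules:
            if lhs == t:
                out.append(rhs)
                break
        else:
            out.append(t)
    return out

def score(rules):
    """Soft-constraint objective: reward proved examples, penalize rule count."""
    proved = sum(apply(rules, x) == y for x, y in TRAIN)
    return proved - LAMBDA * len(rules)

def induce():
    """Exhaustive search over rule subsets (Z3's MAX-SAT solve stands in
    for this in the actual system)."""
    subsets = chain.from_iterable(
        combinations(CANDIDATES, k) for k in range(len(CANDIDATES) + 1))
    return max(subsets, key=score)

best = induce()
# The two ground-truth rules win: they prove all examples with minimal size.
```

The objective mirrors the trade-off the quotes describe: covering training examples while keeping the rule set small, so the conflicting rule ("jump" → "WALK") is never selected.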
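The setup row also quotes "forward chaining as the prover and a depth limit of 7". A minimal sketch of depth-limited forward chaining over natural-language-like facts is below; the rule format (premise tuple, conclusion string) and the example facts are hypothetical, and the real prover operates on MetaQNL sentences with variables rather than exact string matches.

```python
def forward_chain(facts, rules, depth_limit=7):
    """Naive forward chaining: repeatedly fire rules whose premises are all
    known, until no new fact is derived or the depth limit is reached.
    Each rule is (premises, conclusion); facts are plain strings."""
    known = set(facts)
    for _ in range(depth_limit):
        new = {concl for premises, concl in rules
               if set(premises) <= known and concl not in known}
        if not new:  # fixpoint reached before hitting the depth limit
            break
        known |= new
    return known

# Toy RuleTaker-flavored knowledge base (illustrative strings only).
rules = [
    (("tweety is a bird",), "tweety can fly"),
    (("tweety can fly",), "tweety is airborne"),
]
facts = {"tweety is a bird"}

derived = forward_chain(facts, rules)
```

The depth limit bounds the number of chaining iterations, matching how the quoted setup caps proof depth at 7.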