Learning to Recombine and Resample Data For Compositional Generalization

Authors: Ekin Akyürek, Afra Feyza Akyürek, Jacob Andreas

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate R&R on two tests of compositional generalization: the SCAN instruction following task (Lake & Baroni, 2018) and a few-shot morphology learning task derived from the SIGMORPHON 2018 dataset (Kirov et al., 2018; Cotterell et al., 2018). Our experiments are designed to explore the effectiveness of learned data recombination procedures in controlled and natural settings.
Researcher Affiliation | Academia | Ekin Akyürek (MIT CSAIL), Afra Feyza Akyürek (Boston University), Jacob Andreas (MIT CSAIL)
Pseudocode | No | The paper does not contain any explicitly labeled "Pseudocode" or "Algorithm" blocks.
Open Source Code | Yes | Code for all experiments in this paper is available at https://github.com/ekinakyurek/compgen.
Open Datasets | Yes | We evaluate R&R on two tests of compositional generalization: the SCAN instruction following task (Lake & Baroni, 2018) and a few-shot morphology learning task derived from the SIGMORPHON 2018 dataset (Kirov et al., 2018; Cotterell et al., 2018).
Dataset Splits | Yes | We construct splits of the data featuring a training set of 1000 examples and three test sets of 100 examples. ... we construct five different splits per language and use the Spanish past-tense data as a development set.
Hardware Specification | Yes | We use a single 32GB NVIDIA V100 Volta GPU for each experiment.
Software Dependencies | Yes | We implemented our experiments in Knet (Yuret, 2016) using Julia (Bezanson et al., 2017).
Experiment Setup | Yes | Morphology: The hidden and embedding sizes are 1024. No dropout is applied. ... SCAN: We choose the hidden size as 512, and embedding size as 64. We apply 0.5 dropout to the input.
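
For readers who want the reported settings in machine-readable form, the following is a minimal sketch that records only the values quoted in the Experiment Setup and Dataset Splits rows. It is written in Python for illustration, whereas the authors' code uses Julia and Knet. The class name, field names, and helper function are hypothetical and do not come from the paper or its repository. Settings elided in the quoted text ("...") are left out, not guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedSetup:
    """Hyperparameters quoted in the Experiment Setup row (field names are illustrative)."""
    task: str
    hidden_size: int
    embedding_size: int
    input_dropout: float  # 0.0 encodes "no dropout is applied"


# Values copied from the quoted text; nothing beyond it is assumed.
MORPHOLOGY = ReportedSetup(task="morphology", hidden_size=1024,
                           embedding_size=1024, input_dropout=0.0)
SCAN = ReportedSetup(task="scan", hidden_size=512,
                     embedding_size=64, input_dropout=0.5)

# Split sizes quoted in the Dataset Splits row.
TRAIN_EXAMPLES = 1000
TEST_SETS = 3
TEST_EXAMPLES_PER_SET = 100


def total_examples_per_split() -> int:
    """Examples used by one split: the training set plus all test sets."""
    return TRAIN_EXAMPLES + TEST_SETS * TEST_EXAMPLES_PER_SET
```

Keeping the two tasks as separate frozen records makes it harder to mix up their settings, for example by applying the SCAN input dropout to the morphology model.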