A Retrieve-and-Edit Framework for Predicting Structured Outputs
Authors: Tatsunori B. Hashimoto, Kelvin Guu, Yonatan Oren, Percy S. Liang
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that on a new autocomplete task for GitHub Python code and the Hearthstone cards benchmark, retrieve-and-edit significantly boosts the performance of a vanilla sequence-to-sequence model on both tasks. |
| Researcher Affiliation | Academia | Tatsunori B. Hashimoto, Department of Computer Science, Stanford University; Kelvin Guu, Department of Statistics, Stanford University; Yonatan Oren, Department of Computer Science, Stanford University; Percy Liang, Department of Computer Science, Stanford University |
| Pseudocode | No | The paper describes the overall procedure in text (Section 3.1.4) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Reproducibility. Data and code used to generate the results of this paper are available on the CodaLab Worksheets platform at https://worksheets.codalab.org/worksheets/0x1ad3f387005c492ea913cf0f20c9bb89/. |
| Open Datasets | Yes | Our Python autocomplete dataset is a representative sample of Python code from GitHub, obtained from Google BigQuery by retrieving Python code containing at least one block comment with reStructuredText (reST) formatting (See Appendix C for details). ... The Hearthstone cards benchmark consists of 533 cards in a computer card game, where each card is associated with a code snippet. The Hearthstone cards benchmark [22] |
| Dataset Splits | Yes | We also removed any duplicate function/docstring pairs and split the train and test set at the repository level. ... obtained by evaluating BLEU scores on the development set of both datasets. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python version, library versions like PyTorch or TensorFlow). |
| Experiment Setup | Yes | Both the retriever and editor were trained for 1000 iterations on Hearthstone and 3000 on GitHub via ADAM minibatch gradient descent, with batch size 16 and a learning rate of 0.001. |
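
The Dataset Splits and Experiment Setup rows can be made concrete with a short sketch. This is only an illustration under stated assumptions, not the authors' code: the `TrainConfig` class, the `CONFIGS` keys, the `split_by_repository` helper, the example field names (`repo`, `function`, `docstring`), and the 10% test fraction are all hypothetical. Only the iteration counts, batch size, and learning rate come from the quoted text, and the order of deduplication and splitting is assumed.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the Experiment Setup row."""
    iterations: int
    batch_size: int = 16          # reported minibatch size
    learning_rate: float = 1e-3   # reported Adam learning rate


# Iteration counts as reported; the dictionary keys are our own labels.
CONFIGS = {
    "hearthstone": TrainConfig(iterations=1000),
    "github": TrainConfig(iterations=3000),
}


def split_by_repository(examples, test_fraction=0.1, seed=0):
    """Drop duplicate (function, docstring) pairs, then split so that
    no repository contributes examples to both train and test.

    `examples` is an iterable of dicts with hypothetical keys
    "repo", "function", and "docstring". `test_fraction` is a
    placeholder value, not a number taken from the paper.
    """
    seen = set()
    unique = []
    for ex in examples:
        key = (ex["function"], ex["docstring"])
        if key not in seen:
            seen.add(key)
            unique.append(ex)

    # Shuffle repositories (not individual examples) with a fixed seed.
    repos = sorted({ex["repo"] for ex in unique})
    random.Random(seed).shuffle(repos)
    n_test = max(1, int(len(repos) * test_fraction))
    test_repos = set(repos[:n_test])

    train = [ex for ex in unique if ex["repo"] not in test_repos]
    test = [ex for ex in unique if ex["repo"] in test_repos]
    return train, test
```

Splitting on repositories rather than on individual functions keeps near-identical code from the same project out of both sets, which matters for a retrieval-based model that could otherwise retrieve a sibling of the test example from the training set.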