Neural Task Synthesis for Visual Programming
Authors: Victor-Alexandru Pădurean, Georgios Tzannetos, Adish Singla
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of NeurTaskSyn through an extensive empirical evaluation and a qualitative study on reference tasks taken from the Hour of Code: Classic Maze challenge by Code.org and the Intro to Programming with Karel course by CodeHS.com. |
| Researcher Affiliation | Academia | Victor-Alexandru Pădurean (Max Planck Institute for Software Systems); Georgios Tzannetos (Max Planck Institute for Software Systems); Adish Singla (Max Planck Institute for Software Systems) |
| Pseudocode | Yes | Algorithm 1: Specification Dataset Collection Procedure |
| Open Source Code | Yes | We publicly release the implementation and datasets to facilitate future research. (GitHub repository: https://github.com/machine-teaching-group/tmlr2024_neurtasksyn) |
| Open Datasets | Yes | We publicly release the implementation and datasets to facilitate future research. (GitHub repository: https://github.com/machine-teaching-group/tmlr2024_neurtasksyn) |
| Dataset Splits | Yes | In our evaluation, we split D as follows: 80% for training the neural models (Dtrain), 10% for calibration (Dcal), and a fixed 10% for evaluation (Dtest). |
| Hardware Specification | Yes | All the experiments were conducted on a cluster of machines equipped with Intel Xeon Gold 6142 CPUs running at a frequency of 2.60GHz. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | Yes | Hyperparameters (HoCMaze / Karel): Max. Epochs 60 / 100; Batch Size 32 / 32; Learning Rate 5 × 10^-4 / 5 × 10^-4; Dict. Size 59 / 59; Max. Blocks 17 / 17. Figure 18: Illustration of training details for the code generator. (a) and (b) show the training curves with mean epoch loss and validation performance, based on metric M, for both the HoCMaze and Karel domains. (c) shows the hyperparameters employed for the code generator training. |
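
The Dataset Splits row describes an 80% / 10% / 10% partition of D into Dtrain, Dcal, and Dtest. A minimal sketch of such a split is shown below. It is not taken from the released repository: the function name `split_dataset`, the seed value, and the use of a seeded shuffle to keep the test portion fixed across runs are all assumptions made for illustration.

```python
import random


def split_dataset(items, seed=0):
    """Shuffle a copy of `items` and split it 80/10/10.

    Hypothetical helper: the seed makes the shuffle reproducible, so the
    same items land in the evaluation split on every run.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)

    n = len(shuffled)
    n_train = n * 8 // 10  # 80% for training (integer arithmetic)
    n_cal = n // 10        # 10% for calibration

    d_train = shuffled[:n_train]
    d_cal = shuffled[n_train:n_train + n_cal]
    d_test = shuffled[n_train + n_cal:]  # remainder, about 10%
    return d_train, d_cal, d_test


# Stand-in data: integers in place of task specifications.
d_train, d_cal, d_test = split_dataset(range(1000))
print(len(d_train), len(d_cal), len(d_test))  # prints "800 100 100"
```

Assigning the remainder to the test split guarantees that every item is used exactly once, even when the dataset size is not a multiple of ten.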