Synthesizing Libraries of Programs with Auxiliary Functions
Authors: Habibur Rahman, Thirupathi Reddy Emireddy, Kenneth Tjhia, Elham Parhizkar, Levi Lelis
TMLR 2024 | Venue PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate Aulile in string manipulation tasks. Aulile improved, in some cases by a large margin, the performance of several base algorithms that use different search and learning strategies: BUS, Bustle, Crossbeam, and Bee Search. Our results suggest that Aulile offers an effective method of injecting domain knowledge into existing systems through a library learning scheme that optimizes for an auxiliary function. We perform three sets of experiments. In the first set, we compare the base synthesizers with their augmented counterparts (Figures 3 and 4). We compare the algorithms in terms of the number of problems solved by the number of programs evaluated. In the second set, we compare the best performing systems of each experiment in the first set (Figure 5). Since the algorithms use different models, we use running time instead of number of evaluations. This is to account for the complexity of the models employed. For example, Crossbeam uses a more complex and thus slower model than Bustle. We used 14 million programs evaluated as the budget B of Aulile for A-BUS, A-Bustle, and A-Bee. In the third set, we perform ablation studies. |
| Researcher Affiliation | Academia | Habibur Rahman EMAIL Amii, Department of Computing Science, University of Alberta |
| Pseudocode | Yes | Algorithm 1 Aulile Require: Input-output pairs I, O, DSL G, auxiliary function a, base synthesizer Y , and a budget B Ensure: Solution program p or |
| Open Source Code | Yes | Our implementation is available online.1 1https://github.com/lelis-research/aulile |
| Open Datasets | Yes | We evaluate Aulile on string manipulation (Alur et al., 2013; Odena et al., 2021) tasks, in which one is given a set of input-output examples and needs to find a program that maps each input to its corresponding output. We use two datasets from Odena et al. (2021): one with 89 instances of the Sy Gu S competition and another with 38 handcrafted instances. |
| Dataset Splits | No | We evaluate Aulile on string manipulation (Alur et al., 2013; Odena et al., 2021) tasks, in which one is given a set of input-output examples and needs to find a program that maps each input to its corresponding output. We use two datasets from Odena et al. (2021): one with 89 instances of the Sy Gu S competition and another with 38 handcrafted instances. The paper does not specify training, validation, or test splits for these datasets within its experimental setup. |
| Hardware Specification | No | We used 1 CPU at 2.4 GHz and 64 GB of RAM in all experiments; for Bustle, A-Bustle, Bee Search, A-Bee, Crossbeam, and A-Crossbeam we also used 1 GPU, since they use a neural model. Specific CPU/GPU models are not mentioned. |
| Software Dependencies | No | We implemented the DSL for string manipulation, as well as BUS and Bustle. We use the implementations of Crossbeam and Bee Search provided by their authors. No specific version numbers for software or libraries are provided. |
| Experiment Setup | Yes | We used 14 million programs evaluated as the budget B of Aulile for A-BUS, A-Bustle, and A-Bee. In the second set of experiments, we used 5 million evaluations as B for A-Crossbeam. We plot the mean and standard deviation over five independent runs for Bustle, A-Bustle, Bee Search, A-Bee, Crossbeam, and A-Crossbeam because they depend on the random initialization of the weights of their neural models and also sampling in the case of Crossbeam and A-Crossbeam. BUS and A-BUS are deterministic, therefore we present the results of a single run. We use the Levenshtein distance (Levenshtein, 1966) as the auxiliary function. Figure 6 shows the results of A-BUS when it adds the best k programs to the language in each Aulile iteration, for k in {1, 2, 3}. We define different auxiliary functions that are based on the Levenshtein distance. This is achieved with a parameter 0.0 < l <= 1.0. |