ϵ-PAL: An Active Learning Approach to the Multi-Objective Optimization Problem
Authors: Marcela Zuluaga, Andreas Krause, Markus Püschel
JMLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Further, we perform an experimental evaluation on three real-world data sets that demonstrate ϵ-PAL's effectiveness; in comparison to the state-of-the-art active learning algorithm PAL, ϵ-PAL reduces the amount of computations and the number of samples from the design space required to meet the user's desired level of accuracy. In addition, we show that ϵ-PAL improves significantly over a state-of-the-art multi-objective optimization method, saving in most cases 30% to 70% evaluations to achieve the same accuracy. |
| Researcher Affiliation | Academia | Marcela Zuluaga EMAIL Department of Computer Science ETH Zurich Zurich, Switzerland Andreas Krause EMAIL Department of Computer Science ETH Zurich Zurich, Switzerland Markus Püschel EMAIL Department of Computer Science ETH Zurich Zurich, Switzerland |
| Pseudocode | Yes | Algorithm 1 The ϵ-PAL algorithm. Input: design space E (|E| = n); GP prior µ0,i, σ0, ki for all 1 ≤ i ≤ n; ϵ; βt for t ∈ ℕ. Output: predicted Pareto set P̂. 1: P0 = ∅ {predicted set}, U0 = E {undecided set} 2: R0(x) = ℝⁿ for all x ∈ E, t = 1 3: repeat ... Algorithm 2 Discard points in U that are ϵ-dominated by Ppess for m = 2. Input: set U; Pareto pessimistic set Ppess; ϵ. Output: updated set U. 1: U = {(x, min(Rt(x)) + ϵ) : x ∈ Ppess} ∪ {(x, max(Rt(x))) : x ∈ U \ Ppess} {U is a set of (x, (f̂1, f̂2)) pairs} 2: sortU = sort U with respect to f̂1 in ascending order 3: currentMax = −∞ 4: for (x, (f̂1, f̂2)) in sortU do ... |
| Open Source Code | Yes | For our experimental evaluation, we implemented a prototype that is restricted to m = 2 objectives. The code is available at Zuluaga et al. (2015). ... M. Zuluaga, A. Krause, and M. Püschel. e-PAL source code, 2015. URL http://www.spiral.net/software/pal.html. |
| Open Datasets | Yes | The first data set, called SNW, is taken from Zuluaga et al. (2012b). The second data set, called NoC, is taken from Almer et al. (2011). The third data set, called SW-LLVM, is taken from Siegmund et al. (2012). |
| Dataset Splits | No | Prior to running ϵ-PAL, a subset of the design space was evaluated in order to initialize the training set and optimize the kernel parameters used by the model. For SNW and NoC, 15 points were chosen uniformly at random. For LLVM-SW, as the dimensionality of the design space is much larger, 30 points were taken. The paper describes initial sampling for model initialization but not explicit train/test/validation splits of the datasets themselves. |
| Hardware Specification | Yes | All of the experiments were run on a machine equipped with two 6-Core Intel Xeon X5680 (3.06GHz) and 144GB (1333MHz) of memory. |
| Software Dependencies | Yes | Our implementation of ϵ-PAL uses the Gaussian Process Regression and Classification Toolbox for Matlab (Rasmussen and Nickisch, 2013). ... C. E. Rasmussen and H. Nickisch. Gaussian process regression and classification toolbox Version 3.2 for Matlab 7.x, 2013. |
| Experiment Setup | Yes | The standard deviation of the noise ν is fixed to 0.1. We use βt = 2 log(m|E|π²t²/(6δ)), which is the value suggested in Zuluaga et al. (2013) scaled by 1/3. This factor is empirical and used because the theoretical analysis is too conservative. In Srinivas et al. (2010) 1/5 is used. All of the experiments were repeated 200 times and the median of the outcomes is shown in the plots. Additionally, several values of ϵ = (ϵi)1≤i≤2 were evaluated, where ϵi is proportional to the range ri = maxx∈E fi(x) − minx∈E fi(x). We start with ϵi = 0.01·ri and increase it up to ϵi = 0.3·ri. When we say ϵ = 1%, we mean that ϵi is 1% of the range ri. Prior to running ϵ-PAL, a subset of the design space was evaluated in order to initialize the training set and optimize the kernel parameters used by the model. For SNW and NoC, 15 points were chosen uniformly at random. For LLVM-SW, as the dimensionality of the design space is much larger, 30 points were taken. |
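The ϵ-dominance discard quoted in the Pseudocode row (Algorithm 2) can be illustrated with a small sketch. This is not the paper's Matlab implementation: it uses a naive pairwise check rather than the paper's sort-and-sweep, assumes maximization of both objectives, and the names `undecided`, `pessimistic`, and `eps` are illustrative.

```python
# Hedged sketch of the epsilon-dominance discard (Algorithm 2, m = 2),
# assuming both objectives are maximized. A point x in U is discarded when
# its optimistic value max(R_t(x)) is epsilon-dominated by the pessimistic
# value min(R_t(x')) of some point x' in the pessimistic Pareto set.

def discard_eps_dominated(undecided, pessimistic, eps):
    """undecided, pessimistic: dicts mapping point id -> (lo, hi), where lo
    and hi are 2-tuples bounding the uncertainty region R_t(x) per objective.
    eps: 2-tuple of per-objective tolerances. Returns the retained points."""
    kept = {}
    for x, (lo_x, hi_x) in undecided.items():
        if x in pessimistic:
            kept[x] = (lo_x, hi_x)  # pessimistic Pareto points are never discarded
            continue
        dominated = any(
            all(lo_p[i] + eps[i] >= hi_x[i] for i in range(2))
            for lo_p, hi_p in pessimistic.values()
        )
        if not dominated:
            kept[x] = (lo_x, hi_x)
    return kept
```

The paper's sorted single-pass sweep achieves the same result in O(|U| log |U|) instead of this sketch's O(|U|·|Ppess|).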
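The βt schedule and ϵ scaling described in the Experiment Setup row can be written out as a short sketch. The function and parameter names (`beta_t`, `E_size`, `delta`, `fraction`) are assumptions for illustration, not taken from the paper's code.

```python
import math

def beta_t(t, m, E_size, delta, scale=1.0 / 3.0):
    # beta_t = 2 log(m |E| pi^2 t^2 / (6 delta)), scaled by the empirical
    # factor 1/3 used in the paper's experiments.
    return scale * 2.0 * math.log(m * E_size * math.pi**2 * t**2 / (6.0 * delta))

def eps_from_range(f_values, fraction):
    # eps_i is proportional to the objective's range r_i = max f_i - min f_i;
    # e.g. fraction = 0.01 corresponds to "eps = 1%" in the text.
    return fraction * (max(f_values) - min(f_values))
```

As in the experiments, `fraction` would be swept from 0.01 up to 0.3 per objective.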