Lazifying Conditional Gradient Algorithms
Authors: Gábor Braun, Sebastian Pokutta, Daniel Zink
JMLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Complementing the theoretical analysis we report computational results demonstrating effectiveness of our approach via a significant reduction in wall-clock time compared to their linear optimization counterparts. ... 8. Experiments We implemented and compared the parameter-free variant of LCG (Algorithm 7) to the standard Frank Wolfe algorithm (CG), then Algorithm 4 (LPCG) to the Pairwise Conditional Gradient algorithm (PCG) of Garber and Meshi (2016), as well as Algorithm 8 (LOCG) to the Online Frank Wolfe algorithm (OCG) of Hazan and Kale (2012). ... In all cases we report significant speedups in wall-clock time often of several orders of magnitude. |
| Researcher Affiliation | Academia | Gábor Braun EMAIL, Sebastian Pokutta EMAIL, Daniel Zink EMAIL — School of Industrial & Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA |
| Pseudocode | Yes | Algorithm 1 Frank Wolfe Algorithm Frank and Wolfe (1956) ... Algorithm 2 LPsepP(c, x, Φ, K) via LP oracle ... Algorithm 3 Lazy Conditional Gradient ... Algorithm 4 Lazy Pairwise Conditional Gradient (LPCG) ... Algorithm 5 Lazy Local Conditional Gradient (LLCG) ... Algorithm 6 Weak Local Separation LLPsepP(c, x, r, Φ, K) ... Algorithm 7 Parameter-free Lazy Conditional Gradient (LCG) ... Algorithm 8 Lazy Online Conditional Gradient (LOCG) ... Algorithm 9 Augmenting Weak Separation LPsepP(c, x, Φ, K) |
| Open Source Code | No | The paper states: We implemented all algorithms in Python 2.7 with critical functions cythonized for performance employing Numpy. We used these packages from the Anaconda 4.2.0 distribution as well as Gurobi 7.0 Gurobi Optimization (2016) as a black box solver for the linear optimization oracle. This describes the software used for implementation but does not state that the authors' specific code for their methodology is open-source or provide a link to it. |
| Open Datasets | Yes | Video colocalization. Video colocalization is the problem of identifying objects in a sequence of multiple frames in a video. In Joulin et al. (2014) it is shown that video colocalization can be reduced to optimizing a quadratic objective function over a flow or a path polytope, which is the problem we are going to solve. The resulting linear program is an instance of the minimum-cost network flow problem, see (Joulin et al., 2014, Eq. (3)) for the concrete linear program and more details. ... We tested the vanilla Frank Wolfe algorithm on the six video colocalization instances with underlying path polytopes from http://lime.cs.elte.hu/~kpeter/data/mcf/netgen/ (Figure 1). ... We also tested our algorithm on the quadratic unconstrained boolean optimization (QUBO) instances defined on Chimera graphs Dash (2013), which are available at http://researcher.watson.ibm.com/researcher/files/us-sanjeebd/chimera-data.zip. |
| Dataset Splits | No | The paper describes how problem instances were generated, e.g., for matrix completion: 'We generate the m × n matrix A as the product of AL of size m × r and AR of size r × n. The entries in AL and AR are chosen from a standard Gaussian distribution. The set Ω is chosen uniformly of size s = min{5r(m+n−r), 0.99mn}.' For online algorithms, it states: 'each experiment used a random sequence of 100 different random loss functions.' However, it does not explicitly provide information on how these generated instances or data were split into training, validation, or test sets in terms of percentages or counts for reproducing experiments. |
| Hardware Specification | Yes | All experiments were performed on a 16-core machine with Intel Xeon E5-2630 v3 @ 2.40GHz CPUs and 128GB of main memory. |
| Software Dependencies | Yes | We implemented all algorithms in Python 2.7 with critical functions cythonized for performance employing Numpy. We used these packages from the Anaconda 4.2.0 distribution as well as Gurobi 7.0 Gurobi Optimization (2016) as a black box solver for the linear optimization oracle. |
| Experiment Setup | Yes | Unless stated otherwise the weak separation oracle is implemented as sketched in Algorithm 2 through caching and early termination of the original LP oracle. We have used K = 1.1 and K = 1 as multiplicative factors for the weak separation oracle; for the impact of the choice of K see Section 8.2.2. ... The parameters for Gurobi were kept at their default settings except for enforcing the time limit of the tests and setting the acceptable duality gap to 10%, allowing Gurobi to terminate the linear optimization early, avoiding the expensive proof of optimality. ... The stopping criterion for each of the experiments was a given wall-clock time limit in seconds. ... In some cases... we reduced the cache size every 100 steps by keeping only the 100 points that were used the most so far. ... If not stated otherwise, we used (simple backtracking) line search. ... For the instances that need a very long time to solve the (approximate) linear optimization even once, we used a binary search for Φ0 for the lazy algorithms: starting from a conservative initial value, using the update rule Φ0 ← Φ0/2 until the separation oracle returns an improvement for the first time, and then we start the algorithm with 2Φ0. |
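The two setup ingredients quoted above — the weak separation oracle implemented via caching with an LP-oracle fallback, and the Φ0 halving search — can be sketched as follows. This is a minimal illustration of the idea, not the authors' released code; the names `lp_oracle` and `phi_start`, and the simplex toy setting in the usage below, are assumptions made for the sketch:

```python
import numpy as np

def weak_separation(c, x, phi, K, cache, lp_oracle):
    """Weak separation via caching and an LP-oracle fallback (the idea
    sketched in Algorithm 2).  Returns (y, True) if a vertex y with
    c.(x - y) >= phi/K is found, and (None, False) otherwise."""
    target = phi / K
    # 1. Try cached vertices first: cheap dot products instead of an LP solve.
    for y in cache:
        if c @ (x - y) >= target:
            return y, True                 # improving vertex from the cache
    # 2. Only if the cache fails, call the expensive linear optimization oracle.
    y = lp_oracle(c)
    if c @ (x - y) >= target:
        cache.append(y)                    # remember the vertex for later calls
        return y, True
    return None, False                     # no vertex improves by phi/K

def initial_phi(c, x, K, cache, lp_oracle, phi_start=1e6):
    """Binary search for the initial bound: halve Phi_0 until the weak
    separation oracle first returns an improvement, then start with 2*Phi_0."""
    phi = phi_start
    while True:
        _, improving = weak_separation(c, x, phi, K, cache, lp_oracle)
        if improving:
            return 2 * phi
        phi /= 2
```

On a toy problem over the unit-simplex vertices, repeated calls with the same gradient are answered from the cache without touching the LP oracle — which is where the reported wall-clock speedups come from.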