CoCoA: A General Framework for Communication-Efficient Distributed Optimization
Authors: Virginia Smith, Simone Forte, Chenxin Ma, Martin Takáč, Michael I. Jordan, Martin Jaggi
JMLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we demonstrate the empirical performance of CoCoA in the distributed setting. We first compare CoCoA to competing methods for two common machine learning applications: lasso regression (Section 6.1) and support vector machine (SVM) classification (Section 6.2). We then explore the performance of CoCoA in the primal versus the dual directly by solving an elastic net regression model with both variants (Section 6.3). Finally, we illustrate general properties of the CoCoA method empirically in Section 6.4. |
| Researcher Affiliation | Academia | Virginia Smith EMAIL Department of Computer Science Stanford University Stanford, CA 94305, USA; Simone Forte EMAIL Department of Computer Science ETH Zürich 8006 Zürich, Switzerland; Chenxin Ma EMAIL Industrial and Systems Engineering Department Lehigh University Bethlehem, PA 18015, USA; Martin Takáč EMAIL Industrial and Systems Engineering Department Lehigh University Bethlehem, PA 18015, USA; Michael I. Jordan EMAIL Division of Computer Science and Department of Statistics University of California Berkeley, CA 94720, USA; Martin Jaggi EMAIL School of Computer and Communication Sciences EPFL 1015 Lausanne, Switzerland |
| Pseudocode | Yes | Algorithm 1 Generalized CoCoA Distributed Framework; Algorithm 2 CoCoA-Primal (Mapping Problem (I) to (A)); Algorithm 3 CoCoA-Dual (Mapping Problem (I) to (B)) |
| Open Source Code | Yes | All algorithms for comparison are implemented in Apache Spark and run on Amazon EC2 clusters. Our code is available at: gingsmith.github.io/cocoa/. |
| Open Datasets | Yes | Table 5: Datasets for Empirical Study includes: url, epsilon, kddb, webspam. We demonstrate these gains with an extensive experimental comparison on real-world distributed datasets. |
| Dataset Splits | No | The paper provides 'Training Size' and 'Feature Size' in Table 5, but does not specify explicit training/test/validation splits (e.g., percentages or counts) or reference standard splits. |
| Hardware Specification | Yes | All experiments are run on Amazon EC2 clusters of m3.xlarge machines with one core per machine. |
| Software Dependencies | Yes | All experiments are run on Amazon EC2 clusters of m3.xlarge machines, with one core per machine. The code for each method is written in Apache Spark, v1.5.0. Our code is open source and publicly available at gingsmith.github.io/cocoa/. Mb-SGD: Mini-batch stochastic gradient. For our experiments with lasso, we compare against Mb-SGD with an L1-prox. The first three methods are optimized and implemented in Apache Spark's MLlib (v1.5.0) (Meng et al., 2016). |
| Experiment Setup | Yes | We carefully tune each competing method in our experiments for best performance. ADMM requires the most tuning, both in selecting the penalty parameter ρ and in solving the subproblems... For Mb-SGD, we tune the step size and mini-batch size parameters. For Mb-CD and Mb-SDCA, we scale the updates at each round by β/b for mini-batch size b and β ∈ [1, b], and tune both parameters b and β. The only parameter influencing the overall performance of CoCoA is the level of approximation quality, which we parameterize in the experiments through H, the number of local iterations of the iterative method run locally. |
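To make the role of the single tuning parameter H concrete, the sketch below simulates a CoCoA-style scheme for lasso on one machine: feature columns are partitioned across K simulated workers, each worker runs H local coordinate-descent steps against a frozen copy of the shared residual, and the resulting updates are combined by conservative averaging (aggregation weight 1/K). This is an illustrative sketch only, not the authors' released code; the function names, the choice of coordinate descent as the local solver, and the 1/K averaging are assumptions made for the example.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator, the proximal map of the L1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_objective(A, b, x, lam):
    """0.5 * ||Ax - b||^2 + lam * ||x||_1."""
    r = A @ x - b
    return 0.5 * (r @ r) + lam * np.sum(np.abs(x))

def cocoa_style_lasso(A, b, lam, K=4, H=10, rounds=20, seed=0):
    """Single-machine simulation of a CoCoA-style lasso solver (illustrative).

    Columns of A are partitioned across K simulated workers. Each round,
    every worker runs H coordinate-descent steps on its own coordinates,
    using a frozen copy of the shared residual; updates are then averaged
    with weight 1/K, mimicking one communication round.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    col_sq = np.sum(A * A, axis=0)            # per-coordinate curvature
    parts = np.array_split(rng.permutation(d), K)
    for _ in range(rounds):
        shared_res = A @ x - b                # residual at round start
        delta = np.zeros(d)
        for coords in parts:                  # each "worker" in turn
            local_res = shared_res.copy()
            local_x = x.copy()
            for _ in range(H):                # H = local approximation quality
                j = rng.choice(coords)
                g = A[:, j] @ local_res       # partial gradient for coord j
                new_xj = soft_threshold(local_x[j] - g / col_sq[j],
                                        lam / col_sq[j])
                local_res += (new_xj - local_x[j]) * A[:, j]
                local_x[j] = new_xj
            delta[coords] = local_x[coords] - x[coords]
        x = x + delta / K                     # conservative averaging
    return x
```

Raising H makes each round more expensive locally but reduces the number of communication rounds needed, which is the trade-off the paper tunes through this one parameter.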