Learning Sparse Low-Threshold Linear Classifiers
Authors: Sivan Sabato, Shai Shalev-Shwartz, Nathan Srebro, Daniel Hsu, Tong Zhang
JMLR 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We prove that we can learn efficiently in this setting, at a rate which is linear in both k and the size of the threshold, and that this is the best possible rate. We provide an efficient online learning algorithm that achieves the optimal rate, and show that in the batch case, empirical risk minimization achieves this rate as well. The rates we show are tighter than the uniform convergence rate, which grows with k². |
| Researcher Affiliation | Academia | Sivan Sabato (EMAIL), Ben-Gurion University of the Negev, Beer Sheva 8410501, Israel; Shai Shalev-Shwartz (EMAIL), Benin School of Computer Science and Engineering, The Hebrew University, Givat Ram, Jerusalem 91904, Israel; Nathan Srebro (EMAIL), Toyota Technological Institute at Chicago, 6045 S. Kenwood Ave., Chicago, IL 60637; Daniel Hsu (EMAIL), Department of Computer Science, Columbia University, 1214 Amsterdam Avenue #0401, New York, NY 10027; Tong Zhang (EMAIL), Department of Statistics, Rutgers University, Piscataway, NJ 08854. |
| Pseudocode | Yes | Unnormalized Exponentiated Gradient (unnormalized-EG). Parameters: η, λ > 0. Input: z_1, …, z_T ∈ ℝ^d. Initialize: w_1 = (λ, …, λ) ∈ ℝ^d. Update rule: ∀i, w_{t+1}[i] = w_t[i] · e^{η z_t[i]}. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that source code for the described methodology is publicly available. |
| Open Datasets | No | The paper focuses on theoretical bounds and algorithms for learning linear classifiers in abstract spaces like [0,1]^d or {0,1}^d. It does not refer to any specific named datasets or provide access information for empirical data. |
| Dataset Splits | No | The paper is theoretical and does not present empirical results from experiments on specific datasets; accordingly, no dataset splits are described. |
| Hardware Specification | No | The paper is theoretical and focuses on mathematical proofs and algorithmic analysis. It does not describe any hardware used for experiments. |
| Software Dependencies | No | The paper discusses theoretical algorithms and their properties. It does not mention any specific software libraries, packages, or their version numbers required to implement or reproduce the work. |
| Experiment Setup | No | The paper is theoretical and focuses on algorithm analysis and complexity bounds. It does not describe any experimental setup details such as hyperparameter values, training configurations, or system-level settings. |
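The unnormalized-EG update quoted in the Pseudocode row above can be sketched in a few lines. This is a hedged illustration, not the paper's implementation: the function name `unnormalized_eg`, the use of plain Python lists, and the loop structure are assumptions; only the initialization w_1 = (λ, …, λ) and the multiplicative per-coordinate update w_{t+1}[i] = w_t[i]·e^{η z_t[i]} come from the quoted pseudocode (note that a minus sign in the exponent may have been lost in extraction).

```python
import math

def unnormalized_eg(zs, eta, lam):
    """Sketch of the unnormalized Exponentiated Gradient update.

    zs  -- sequence of length-d vectors z_1, ..., z_T (lists of floats)
    eta -- step size η > 0
    lam -- initialization constant λ > 0

    Returns the list of iterates [w_1, w_2, ..., w_{T+1}].
    """
    d = len(zs[0])
    w = [lam] * d                  # initialize: w_1 = (λ, ..., λ)
    iterates = [list(w)]
    for z in zs:                   # one multiplicative update per round
        # ∀i: w_{t+1}[i] = w_t[i] * e^{η z_t[i]}  (sign as quoted above)
        w = [w[i] * math.exp(eta * z[i]) for i in range(d)]
        iterates.append(list(w))
    return iterates
```

For example, with λ = 1, η = 0.5, and a single round z_1 = (2, 0), the update multiplies the first coordinate by e^{1} and leaves the second at 1.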