Stabilizing Linear Passive-Aggressive Online Learning with Weighted Reservoir Sampling
Authors: Skyler Wu, Fred Lu, Edward Raff, James Holt
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate our WRS approach on the Passive-Aggressive Classifier (PAC) and First-Order Sparse Online Learning (FSOL), where our method consistently and significantly outperforms the unmodified approach. |
| Researcher Affiliation | Collaboration | Skyler Wu (Booz Allen Hamilton; Harvard University; Stanford University), Fred Lu (Booz Allen Hamilton; University of Maryland, Baltimore County), Edward Raff (Booz Allen Hamilton; University of Maryland, Baltimore County), James Holt (Laboratory for Physical Sciences) |
| Pseudocode | Yes | Algorithm 1 WRS-Augmented Training (WAT) |
| Open Source Code | Yes | Code available at https://github.com/FutureComputing4AI/Weighted-Reservoir-Sampling-Augmented-Training |
| Open Datasets | Yes | Table 1: Sizes, dimensions, and sparsities of all datasets used for numerical experiments. ... Avazu (App) [23] (via LIBSVM) ... [23] Steve Wang and Will Cukierski. Click-Through Rate Prediction. Kaggle. https://kaggle.com/competitions/avazu-ctr-prediction, 2014. |
| Dataset Splits | No | We perform a random 70/30 train-test split. The paper specifies the train and test splits but does not explicitly mention a separate validation split or describe how one was used. |
| Hardware Specification | Yes | All experiments were run on a Linux computing cluster with 32 nodes, each with 40 Intel Xeon E5-2650 CPU cores and a total of 500 GB of RAM per node, managed using SLURM. |
| Software Dependencies | No | The paper mentions using libraries like scikit-learn implicitly through citations (e.g., 'Count Vectorizer from [31]'). However, it does not provide specific version numbers for these software components or other key dependencies required for replication. |
| Experiment Setup | Yes | We begin by tuning the Cerr, η, and λ hyperparameters for base PAC and FSOL, with details located in Appendix B. For PAC-WRS and FSOL-WRS, we use the hyperparameters for the corresponding base models, but try all possible WAT variants of weighting scheme (standard or exponential), averaging scheme (simple vs. weighted), voting-based zeroing (True or False), and reservoir size K ∈ {1, 4, 16, 64}. Appendix B.2 details the ranges for Cerr (10^-3 to 10^3), η (2^-3 to 2^9), and λ (0 to 10^3). |
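The core primitive behind WAT is weighted reservoir sampling: maintaining a fixed-size, weight-proportional sample of model snapshots from a stream. The sketch below shows the generic A-Res algorithm of Efraimidis and Spirakis; it is an illustration of the sampling primitive, not the paper's exact WAT procedure, and the function name and stream format are assumptions.

```python
import heapq
import random

def weighted_reservoir_sample(stream, k, seed=0):
    """Generic A-Res weighted reservoir sampling over (item, weight) pairs.

    Keeps a reservoir of k items from a one-pass stream; each item is
    retained with probability proportional to its weight. In WAT, the
    items would be intermediate model weight vectors and the weights a
    function of their online performance (a hypothetical mapping here).
    """
    rng = random.Random(seed)
    heap = []  # min-heap of (key, item); smallest key is evicted first
    for item, w in stream:
        if w <= 0:
            continue  # zero-weight items can never be sampled
        key = rng.random() ** (1.0 / w)  # larger weight -> key closer to 1
        if len(heap) < k:
            heapq.heappush(heap, (key, item))
        elif key > heap[0][0]:
            heapq.heapreplace(heap, (key, item))
    return [item for _, item in heap]
```

One pass, O(k) memory, and O(log k) work per accepted item, which matches the online-learning setting where the stream of model snapshots cannot be stored in full.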
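The WAT variant search described in the experiment setup is a full factorial grid. A minimal sketch of enumerating it (the option names are taken from the paper's description; the variable names are assumptions):

```python
from itertools import product

# Grid of WAT variants per base model, as described in the paper:
# weighting scheme x averaging scheme x voting-based zeroing x reservoir size K.
weighting_schemes = ["standard", "exponential"]
averaging_schemes = ["simple", "weighted"]
voting_zeroing = [True, False]
reservoir_sizes = [1, 4, 16, 64]

wat_variants = list(product(weighting_schemes, averaging_schemes,
                            voting_zeroing, reservoir_sizes))
# 2 * 2 * 2 * 4 = 32 configurations per base model (PAC or FSOL)
```

Because the base-model hyperparameters (Cerr, η, λ) are tuned first and then held fixed, the WRS variants only multiply the experiment count by this 32-way grid rather than the full joint search space.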