Stabilizing Linear Passive-Aggressive Online Learning with Weighted Reservoir Sampling

Authors: Skyler Wu, Fred Lu, Edward Raff, James Holt

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate our WRS approach on the Passive-Aggressive Classifier (PAC) and First-Order Sparse Online Learning (FSOL), where our method consistently and significantly outperforms the unmodified approach."
Researcher Affiliation | Collaboration | Skyler Wu (Booz Allen Hamilton; Harvard University; Stanford University), Fred Lu (Booz Allen Hamilton; University of Maryland, Baltimore County), Edward Raff (Booz Allen Hamilton; University of Maryland, Baltimore County), James Holt (Laboratory for Physical Sciences)
Pseudocode | Yes | Algorithm 1: WRS-Augmented Training (WAT)
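The paper's Algorithm 1 (WAT) maintains a weighted reservoir of intermediate model states during online training. As background only, not the authors' exact algorithm, here is a minimal sketch of standard weighted reservoir sampling (the A-Res scheme of Efraimidis and Spirakis), where each streamed item receives key u^(1/w) and the K largest keys are retained:

```python
import heapq
import random

def weighted_reservoir_sample(stream, k, seed=0):
    """A-Res weighted reservoir sampling: keep the k items with the
    largest keys u ** (1 / w), where u ~ Uniform(0, 1).
    `stream` yields (item, weight) pairs with weight > 0."""
    rng = random.Random(seed)
    heap = []  # min-heap of (key, item); the root is the weakest survivor
    for item, w in stream:
        key = rng.random() ** (1.0 / w)
        if len(heap) < k:
            heapq.heappush(heap, (key, item))
        elif key > heap[0][0]:
            # New item's key beats the current minimum: evict and replace.
            heapq.heapreplace(heap, (key, item))
    return [item for _, item in heap]

# Items with larger weights are retained with higher probability.
sample = weighted_reservoir_sample([("a", 1.0), ("b", 10.0), ("c", 0.1)], k=2)
```

In the paper's setting the "items" would be snapshots of the online learner's weight vector and the sampling weights would come from the chosen weighting scheme (standard or exponential); those specifics are in Algorithm 1 itself.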
Open Source Code | Yes | Code available at https://github.com/FutureComputing4AI/Weighted-Reservoir-Sampling-Augmented-Training
Open Datasets | Yes | "Table 1: Sizes, dimensions, and sparsities of all datasets used for numerical experiments. ... Avazu (App) [23] (via LIBSVM)". Reference [23]: Steve Wang and Will Cukierski. Click-Through Rate Prediction. Kaggle, 2014. https://kaggle.com/competitions/avazu-ctr-prediction
Dataset Splits | No | "We perform a random 70/30 train-test split." The paper specifies train and test splits but does not explicitly mention a separate validation split or how one was used.
Hardware Specification | Yes | "All experiments were run on a Linux computing cluster with 32 nodes, each with 40 Intel Xeon E5-2650 CPU cores and a total of 500 GB of RAM per node, managed using SLURM."
Software Dependencies | No | The paper references libraries such as scikit-learn only implicitly through citations (e.g., "CountVectorizer from [31]") and does not provide version numbers for these or other key dependencies required for replication.
Experiment Setup | Yes | "We begin by tuning the Cerr, η, and λ hyperparameters for base PAC and FSOL, with details located in Appendix B. For PAC-WRS and FSOL-WRS, we use the hyperparameters for the corresponding base models, but try all possible WAT variants of weighting scheme (standard or exponential), averaging scheme (simple vs. weighted), voting-based zeroing (True or False), and reservoir size K ∈ {1, 4, 16, 64}." Appendix B.2 details the search ranges: Cerr from 10^-3 to 10^3, η from 2^-3 to 2^9, and λ from 0 to 10^3.
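The WAT variant grid described above is small enough to enumerate exhaustively. A minimal sketch (variable names are illustrative, not from the authors' code):

```python
from itertools import product

# Axes of the WAT variant grid as described in the experiment setup.
weighting_schemes = ["standard", "exponential"]
averaging_schemes = ["simple", "weighted"]
voting_based_zeroing = [True, False]
reservoir_sizes = [1, 4, 16, 64]

# Cartesian product: every combination is tried per base model.
variants = list(product(weighting_schemes, averaging_schemes,
                        voting_based_zeroing, reservoir_sizes))
# 2 * 2 * 2 * 4 = 32 WAT configurations per base model
```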