Stabilizing Linear Passive-Aggressive Online Learning with Weighted Reservoir Sampling

Authors: Skyler Wu, Fred Lu, Edward Raff, James Holt

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate our WRS approach on the Passive-Aggressive Classifier (PAC) and First-Order Sparse Online Learning (FSOL), where our method consistently and significantly outperforms the unmodified approach."
Researcher Affiliation | Collaboration | Skyler Wu (Booz Allen Hamilton; Harvard University; Stanford University), Fred Lu (Booz Allen Hamilton; University of Maryland, Baltimore County), Edward Raff (Booz Allen Hamilton; University of Maryland, Baltimore County), James Holt (Laboratory for Physical Sciences)
Pseudocode | Yes | Algorithm 1: WRS-Augmented Training (WAT)
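The paper's Algorithm 1 (WAT) maintains a weighted reservoir of intermediate model states during online training. As background only, not the authors' exact algorithm, here is a minimal sketch of standard weighted reservoir sampling (the A-Res scheme of Efraimidis and Spirakis), where each streamed item receives key u^(1/w) and the K largest keys are retained:

```python
import heapq
import random

def weighted_reservoir_sample(stream, k, seed=0):
    """A-Res weighted reservoir sampling: keep the k items with the
    largest keys u ** (1 / w), where u ~ Uniform(0, 1).
    `stream` yields (item, weight) pairs with weight > 0."""
    rng = random.Random(seed)
    heap = []  # min-heap of (key, item); the root is the weakest survivor
    for item, w in stream:
        key = rng.random() ** (1.0 / w)
        if len(heap) < k:
            heapq.heappush(heap, (key, item))
        elif key > heap[0][0]:
            # New item's key beats the current minimum: evict and replace.
            heapq.heapreplace(heap, (key, item))
    return [item for _, item in heap]

# Items with larger weights are retained with higher probability.
sample = weighted_reservoir_sample([("a", 1.0), ("b", 10.0), ("c", 0.1)], k=2)
```

In the paper's setting the "items" would be snapshots of the online learner's weight vector and the sampling weights would come from the chosen weighting scheme (standard or exponential); those specifics are in Algorithm 1 itself.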
Open Source Code | Yes | Code available at https://github.com/FutureComputing4AI/Weighted-Reservoir-Sampling-Augmented-Training
Open Datasets | Yes | "Table 1: Sizes, dimensions, and sparsities of all datasets used for numerical experiments. ... Avazu (App) [23] (via LIBSVM)". Reference [23]: Steve Wang and Will Cukierski. Click-Through Rate Prediction. Kaggle, 2014. https://kaggle.com/competitions/avazu-ctr-prediction
Dataset Splits | No | "We perform a random 70/30 train-test split." The paper specifies train and test splits but does not explicitly mention a separate validation split or how one was used.
Hardware Specification | Yes | "All experiments were run on a Linux computing cluster with 32 nodes, each with 40 Intel Xeon E5-2650 CPU cores and a total of 500 GB of RAM per node, managed using SLURM."
Software Dependencies | No | The paper references libraries such as scikit-learn only implicitly through citations (e.g., "CountVectorizer from [31]") and does not provide version numbers for these or other key dependencies required for replication.
Experiment Setup | Yes | "We begin by tuning the Cerr, η, and λ hyperparameters for base PAC and FSOL, with details located in Appendix B. For PAC-WRS and FSOL-WRS, we use the hyperparameters for the corresponding base models, but try all possible WAT variants of weighting scheme (standard or exponential), averaging scheme (simple vs. weighted), voting-based zeroing (True or False), and reservoir size K ∈ {1, 4, 16, 64}." Appendix B.2 details the search ranges: Cerr from 10^-3 to 10^3, η from 2^-3 to 2^9, and λ from 0 to 10^3.
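The WAT variant grid described above is small enough to enumerate exhaustively. A minimal sketch (variable names are illustrative, not from the authors' code):

```python
from itertools import product

# Axes of the WAT variant grid as described in the experiment setup.
weighting_schemes = ["standard", "exponential"]
averaging_schemes = ["simple", "weighted"]
voting_based_zeroing = [True, False]
reservoir_sizes = [1, 4, 16, 64]

# Cartesian product: every combination is tried per base model.
variants = list(product(weighting_schemes, averaging_schemes,
                        voting_based_zeroing, reservoir_sizes))
# 2 * 2 * 2 * 4 = 32 WAT configurations per base model
```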