SEAD: Unsupervised Ensemble of Streaming Anomaly Detectors

Authors: Saumya Gaurang Shah, Abishek Sankararaman, Balakrishnan Murali Narayanaswamy, Vikramank Singh

ICML 2025

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experiments on 14 non-trivial public datasets and an internal dataset corroborate our claims. |
| Researcher Affiliation | Industry | Amazon Web Services, Santa Clara, CA, USA. Correspondence to: Saumya Gaurang Shah <EMAIL>. |
| Pseudocode | Yes | The complete pseudocode is in Algorithm 1 (SEAD Algorithm); Algorithm 2 gives SEAD++, which optimizes runtime by sampling. |
| Open Source Code | No | The paper mentions using "open source implementations from PySAD (Yilmaz & Kozat, 2020)" for the base models but does not state that the code for SEAD itself is open source, nor does it provide a link. |
| Open Datasets | Yes | We perform experiments on 15 datasets, of which 11 are from the Outlier Detection Data Sets (ODDS) (Rayana, 2016), 3 are from the USP Data Stream Repository (Souza et al., 2020), and one is an internal telemetry dataset from a multi-server database cloud service. |
| Dataset Splits | Yes | We set the first 100 data points for warm starting the base models and SEAD, but not for evaluation, i.e., it is cold start. To overcome this issue, we split each dataset into chunks of 50 contiguous data points. |
| Hardware Specification | Yes | We performed all experiments on a single c5.2xlarge AWS EC2 instance. |
| Software Dependencies | No | The paper mentions using "open source implementations from PySAD (Yilmaz & Kozat, 2020)" and "tdigest (Dunning & Ertl, 2019)" but does not provide specific version numbers for these or other software libraries/dependencies. |
| Experiment Setup | Yes | For our method SEAD, we choose hyperparameters η = 1, λ = 10^6, and π = Uniform distribution across all experiments. Table 11 lists the hyperparameter configurations for the base models. We set the first 100 data points for warm starting the base models and SEAD. |
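The Experiment Setup row names a learning rate η = 1 and a uniform prior π over base detectors. To make the general idea concrete, the sketch below shows a generic multiplicative-weights ensemble over streaming anomaly scores. It is an illustration only: the class name, the `pseudo_loss` argument, and the update rule are assumptions, and it does not reproduce SEAD's actual unsupervised loss or its λ regularization.

```python
import numpy as np

class ExpWeightsEnsemble:
    """Hypothetical sketch of an exponential-weights ensemble of streaming
    anomaly detectors (not the authors' implementation). eta and the uniform
    initial weights mirror the paper's eta = 1 and pi = Uniform; the
    per-round loss is supplied by the caller as a stand-in for SEAD's
    unsupervised loss."""

    def __init__(self, n_models, eta=1.0):
        self.eta = eta
        # pi = Uniform prior over base detectors.
        self.w = np.full(n_models, 1.0 / n_models)

    def score(self, base_scores):
        # Ensemble anomaly score: weighted average of base-model scores.
        return float(self.w @ np.asarray(base_scores))

    def update(self, pseudo_loss):
        # Multiplicative-weights update: detectors with lower loss gain
        # weight. pseudo_loss is a per-detector loss (placeholder only).
        self.w = self.w * np.exp(-self.eta * np.asarray(pseudo_loss))
        self.w = self.w / self.w.sum()

# Usage: three base detectors emit scores for one stream point; after an
# update, the detector with the lowest loss carries the most weight.
ens = ExpWeightsEnsemble(n_models=3, eta=1.0)
scores = np.array([0.2, 0.9, 0.5])
combined = ens.score(scores)          # uniform weights => plain mean
ens.update(pseudo_loss=np.array([1.0, 0.0, 0.5]))
```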