reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Scikit-Multiflow: A Multi-output Streaming Framework

Authors: Jacob Montiel, Jesse Read, Albert Bifet, Talel Abdessalem

JMLR 2018 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	scikit-multiflow is a framework for learning from data streams and multi-output learning in Python. Conceived to serve as a platform to encourage the democratization of stream learning research, it provides multiple state-of-the-art learning methods, data generators and evaluators for different stream learning problems, including single-output, multi-output and multi-label.
Researcher Affiliation	Academia	Jacob Montiel EMAIL LTCI, Télécom ParisTech, Université Paris-Saclay, Paris, FRANCE; Jesse Read EMAIL LIX, École Polytechnique, Palaiseau, FRANCE; Albert Bifet EMAIL LTCI, Télécom ParisTech, Université Paris-Saclay, Paris, FRANCE; Talel Abdessalem EMAIL LTCI, Télécom ParisTech, Université Paris-Saclay, Paris, FRANCE, and UMI CNRS IPAL, National University of Singapore
Pseudocode	Yes	The sequence to train a Stream Model and track its performance using prequential evaluation in scikit-multiflow is outlined in Figure 1. Figure 1: Training and testing a stream model using scikit-multiflow. This sequence corresponds to prequential evaluation. [while there is data in the stream] 1 : evaluate(stream, model) 2 : get next sample 3 : X, y_true = next sample 4 : predict(X) 5 : y_predicted = Prediction 6 : results = evaluate(y_true, y_predicted) 7 [m samples passed] : update_metrics(last_result) 8 [m samples passed] : update_plot(last_result) 9 : partial_fit(X) 10 : trained model
Open Source Code	Yes	The source code is available at https://github.com/scikit-multiflow/scikit-multiflow. The source code of the package is publicly available on Github at https://github.com/scikit-multiflow/scikit-multiflow
Open Datasets	No	The paper describes 'Stream generators' such as Agrawal, Hyperplane, Led, Mixed, Random RBF, etc., which are features of the scikit-multiflow framework for generating data streams. However, it does not explicitly state the use of, or provide access information for, any specific publicly available datasets for experimental evaluation within this paper.
Dataset Splits	No	The paper describes a software framework and its capabilities, including 'Available evaluators correspond to Prequential and Hold-Out evaluations'. However, it does not detail specific experiments conducted with datasets or specify any training/test/validation splits for reproduction.
Hardware Specification	No	The paper describes a software framework and its functionalities. It does not provide any specific details about the hardware used for development or for running any potential experiments.
Software Dependencies	No	scikit-multiflow builds upon popular open source frameworks including scikit-learn, MOA and MEKA. Development follows the FOSS principles. scikit-learn (Pedregosa et al., 2011) is the most popular open source software machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support vector machines, random forest, gradient boosting, k-means and DBSCAN, and is designed to inter-operate with the Python numerical and scientific packages NumPy and SciPy. The paper mentions several software frameworks and programming languages but does not provide specific version numbers for any of them (e.g., Python version, scikit-learn version, etc.).
Experiment Setup	No	The paper focuses on describing the scikit-multiflow framework, its architecture, and available methods and evaluators. It does not include a section detailing specific experimental setups, hyperparameters, or training configurations for any models or experiments.
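
The prequential (test-then-train) sequence quoted in the Pseudocode row can be sketched in plain Python. This is only an illustrative sketch, not the scikit-multiflow implementation: `MajorityClassModel`, `toy_stream`, and `prequential_evaluate` are hypothetical names introduced here, the stream is synthetic, and passing the label to `partial_fit` is an assumption (the figure text shows only `partial_fit(X)`). Plot updates from the figure are omitted.

```python
from collections import Counter
import random


class MajorityClassModel:
    """Hypothetical toy learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, X):
        # Before any training sample has arrived, fall back to label 0.
        if not self.counts:
            return 0
        return self.counts.most_common(1)[0][0]

    def partial_fit(self, X, y):
        # Incremental update: only the label counts change.
        self.counts[y] += 1
        return self


def toy_stream(n_samples, seed=0):
    """Hypothetical generator yielding (X, y) pairs one sample at a time."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        X = [rng.random(), rng.random()]
        y = 1 if rng.random() < 0.7 else 0
        yield X, y


def prequential_evaluate(stream, model, report_every=100):
    """Test-then-train loop: predict on each sample first, then learn from it."""
    correct = 0
    seen = 0
    history = []  # running accuracy, recorded every `report_every` samples
    for X, y_true in stream:               # while there is data in the stream
        y_pred = model.predict(X)          # test on the sample first
        correct += int(y_pred == y_true)
        seen += 1
        if seen % report_every == 0:       # "m samples passed": update metrics
            history.append(correct / seen)
        model.partial_fit(X, y_true)       # then train on the same sample
    return model, history


model, history = prequential_evaluate(toy_stream(1000), MajorityClassModel())
print(history[-1])  # running accuracy after the last reporting interval
```

Because every sample is used for testing before it is used for training, no separate train/test split is needed, which is consistent with the Dataset Splits row reporting none.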