Tslearn, A Machine Learning Toolkit for Time Series Data
Authors: Romain Tavenard, Johann Faouzi, Gilles Vandewiele, Felix Divo, Guillaume Androz, Chester Holtz, Marie Payne, Roman Yurchak, Marc Rußwurm, Kushal Kolar, Eli Woods
JMLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The importance of providing time-series specific methods for machine learning is illustrated in the example below and the corresponding Figure 1, where standard Euclidean k-means fails while DTW-based ones (Sakoe and Chiba, 1978; Petitjean et al., 2011; Cuturi and Blondel, 2017) can distinguish between different time series profiles:<br>`from tslearn.clustering import TimeSeriesKMeans`<br>`from tslearn.datasets import CachedDatasets`<br>`# Load the "Trace" data set`<br>`X_train = CachedDatasets().load_dataset("Trace")[0]`<br>`# Define parameters for each metric`<br>`euclidean_params = {"metric": "euclidean"}`<br>`dba_params = {"metric": "dtw"}`<br>`sdtw_params = {"metric": "softdtw", "metric_params": {"gamma": .01}}`<br>`# Perform clustering for each metric`<br>`y_preds = []`<br>`for params in (euclidean_params, dba_params, sdtw_params):`<br>`    km = TimeSeriesKMeans(n_clusters=3, random_state=0, **params)`<br>`    y_preds.append(km.fit_predict(X_train))` |
| Researcher Affiliation | Collaboration | Romain Tavenard EMAIL Université de Rennes, CNRS, LETG-Rennes, IRISA-Obelix, Rennes, France Johann Faouzi EMAIL Aramis Lab, INRIA Paris, Paris Brain Institute, Paris, France Gilles Vandewiele EMAIL IDLab, Ghent University - imec, Ghent, Belgium Felix Divo EMAIL Technische Universität Darmstadt, Darmstadt, Germany Guillaume Androz EMAIL Icentia Inc., Québec, Canada Chester Holtz EMAIL Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA Marie Payne EMAIL McGill University, Montreal, Québec, Canada Roman Yurchak EMAIL Symerio, Paris, France Marc Rußwurm EMAIL Technical University of Munich, Chair of Remote Sensing Technology, Munich, Germany Kushal Kolar EMAIL Sars International Centre for Marine Molecular Biology, University of Bergen, Norway Eli Woods EMAIL Eaze Technologies, Inc., San Francisco, CA, USA |
| Pseudocode | No | The paper provides code snippets demonstrating the usage of the tslearn library's API, such as importing modules and fitting models, but it does not include structured pseudocode or algorithm blocks describing the underlying methods or procedures. |
| Open Source Code | Yes | tslearn is a general-purpose Python machine learning library for time series that offers tools for pre-processing and feature extraction as well as dedicated models for clustering, classification and regression. It follows scikit-learn's Application Programming Interface for transformers and estimators, allowing the use of standard pipelines and model selection tools on top of tslearn objects. It is distributed under the BSD-2-Clause license, and its source code is available at https://github.com/tslearn-team/tslearn. |
| Open Datasets | Yes | `from tslearn.datasets import CachedDatasets`<br>`# Load the "Trace" data set`<br>`X_train = CachedDatasets().load_dataset("Trace")[0]` |
| Dataset Splits | Yes | `from sklearn.model_selection import KFold, GridSearchCV`<br>`from tslearn.neighbors import KNeighborsTimeSeriesClassifier`<br>`knn = KNeighborsTimeSeriesClassifier(metric="dtw")`<br>`p_grid = {"n_neighbors": [1, 5]}`<br>`cv = KFold(n_splits=2, shuffle=True, random_state=0)`<br>`clf = GridSearchCV(estimator=knn, param_grid=p_grid, cv=cv)`<br>`clf.fit(X, y)` |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware used for running experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | Yes | tslearn v0.3.1 is a cross-platform software package for Python 3.5+. It depends on numpy (Van Der Walt et al., 2011) & scipy (Virtanen et al., 2020) packages for basic array manipulations and standard linear algebra routines and on scikit-learn (Pedregosa et al., 2011) for its API and utilities. It also utilizes Cython (Behnel et al., 2011), numba (Lam et al., 2015) and joblib (Varoquaux et al., 2010) for efficient computation. Finally, keras (Chollet et al., 2015) with tensorflow (Abadi et al., 2016) backend is an optional dependency that is necessary to use the shapelets module in tslearn that provides an efficient implementation of the shapelet model by Grabocka et al. (2014). |
| Experiment Setup | Yes | `from sklearn.model_selection import KFold, GridSearchCV`<br>`from tslearn.neighbors import KNeighborsTimeSeriesClassifier`<br>`knn = KNeighborsTimeSeriesClassifier(metric="dtw")`<br>`p_grid = {"n_neighbors": [1, 5]}`<br>`cv = KFold(n_splits=2, shuffle=True, random_state=0)`<br>`clf = GridSearchCV(estimator=knn, param_grid=p_grid, cv=cv)`<br>`clf.fit(X, y)` |
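
The Research Type row quotes the paper's claim that Euclidean k-means fails where DTW-based variants succeed. The following is a minimal, standard-library sketch of why the two distances can disagree. It is not tslearn's implementation: `dtw_distance` and `euclidean_distance` are hypothetical helper names, the three toy series are made up for illustration, and the textbook dynamic-programming recurrence with a squared pointwise cost is an assumption about the DTW variant.

```python
import math


def euclidean_distance(a, b):
    """Pointwise Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Textbook dynamic-programming DTW with a squared pointwise cost.

    Illustrative sketch only (assumed variant, no warping-window constraint).
    """
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment cost of a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(
                cost[i - 1][j],      # a[i-1] matched again with an earlier b
                cost[i][j - 1],      # b[j-1] matched again with an earlier a
                cost[i - 1][j - 1],  # both series advance together
            )
    return math.sqrt(cost[n][m])


# Made-up toy data: the same bump at two different positions, plus a flat line.
early = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
late = [0, 0, 0, 0, 1, 2, 1, 0, 0, 0]
flat = [0] * 10

print("Euclidean early-late:", euclidean_distance(early, late))
print("Euclidean early-flat:", euclidean_distance(early, flat))
print("DTW early-late:", dtw_distance(early, late))
print("DTW early-flat:", dtw_distance(early, flat))
```

On this toy data, the Euclidean distance places `early` closer to `flat` than to `late`, whereas DTW warps the time axis, aligns the two bumps, and assigns `early` and `late` a distance of zero. A centroid-based clustering that relies on Euclidean distance can therefore group series by where a pattern occurs rather than by its shape.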