TemporalDictionaryEnsemble
class sktime.classification.dictionary_based.TemporalDictionaryEnsemble(n_parameter_samples=250, max_ensemble_size=50, time_limit=0.0, max_win_len_prop=1, min_window=10, randomly_selected_params=50, bigrams=None, dim_threshold=0.85, max_dims=20, n_jobs=1, random_state=None)
Temporal Dictionary Ensemble (TDE) as described in [1].
Overview: the input is n series of length m with d dimensions. TDE searches k parameter values, selected using a Gaussian process regressor, evaluating each with leave-one-out cross-validation (LOOCV). It then retains s ensemble members. There are six primary parameters for individual classifiers:
alpha: alphabet size
w: window length
l: word length
p: normalise/no normalise
h: levels
b: MCB/IGB
For any combination of these, an individual TDE classifier slides a window of length w along the series. Each window of length w is shortened to a word of length l by taking a Fourier transform and keeping the first l/2 complex coefficients. The resulting l values (real and imaginary parts) are then discretised into alpha possible values, using breakpoints found using b, to form a word of length l. A histogram of words for each series is formed and stored, using a spatial pyramid of h levels. For multivariate series, accuracy from a reduced histogram is used to select dimensions.
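The windowing and discretisation steps above can be sketched in plain Python. This is a minimal illustration, not the sktime implementation: the function names are hypothetical, the breakpoints are passed in by hand rather than learned with MCB or IGB, and normalisation (p), the spatial pyramid (h) and bigrams are all left out.

```python
import cmath
from bisect import bisect_right
from collections import Counter


def dft_features(window, word_length):
    """Real and imaginary parts of the first word_length // 2 DFT coefficients."""
    w = len(window)
    feats = []
    for k in range(word_length // 2):
        coef = sum(x * cmath.exp(-2j * cmath.pi * k * t / w)
                   for t, x in enumerate(window))
        feats.extend([coef.real, coef.imag])
    return feats


def window_to_word(window, word_length, breakpoints):
    """Discretise each feature using its own sorted list of cut points.

    alpha - 1 cut points per position give symbols in 0 .. alpha - 1.
    """
    feats = dft_features(window, word_length)
    return tuple(bisect_right(cuts, v) for v, cuts in zip(feats, breakpoints))


def word_histogram(series, window_length, word_length, breakpoints):
    """Slide a window along the series and count the words it produces."""
    hist = Counter()
    for start in range(len(series) - window_length + 1):
        window = series[start:start + window_length]
        hist[window_to_word(window, word_length, breakpoints)] += 1
    return hist


# Toy series of length 16; w = 8, l = 4, alpha = 2 (one cut point per position).
series = [0.0, 1.0, 0.0, -1.0] * 4
breakpoints = [[0.0]] * 4  # assumed cut points, not fitted from data
hist = word_histogram(series, window_length=8, word_length=4,
                      breakpoints=breakpoints)
print(sum(hist.values()))  # number of windows: 16 - 8 + 1
```

Each key of `hist` is one word (a tuple of l symbols) and each value is how often that word occurred in the series.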
fit involves finding n histograms. predict uses 1-nearest-neighbour with the histogram intersection distance function.
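As a rough sketch of the prediction step, histogram intersection can be computed as the sum of the smaller of the two counts for each word, with the nearest neighbour being the training histogram that overlaps most with the query. The function names and the toy histograms below are illustrative assumptions, and treating a larger intersection as "closer" is how this sketch interprets the distance.

```python
from collections import Counter


def histogram_intersection(first, second):
    """Sum of the overlapping counts for words present in both histograms."""
    return sum(min(count, second.get(word, 0)) for word, count in first.items())


def predict_1nn(train_histograms, train_labels, query_histogram):
    """Return the label of the training histogram with the largest overlap."""
    best_label, best_overlap = None, -1
    for hist, label in zip(train_histograms, train_labels):
        overlap = histogram_intersection(query_histogram, hist)
        if overlap > best_overlap:
            best_overlap, best_label = overlap, label
    return best_label


# Toy word histograms (made-up words and counts, for illustration only).
train_histograms = [
    Counter({"aab": 3, "abb": 1}),
    Counter({"bba": 2, "bbb": 2}),
]
train_labels = ["up", "down"]
query = Counter({"aab": 2, "bbb": 1})

prediction = predict_1nn(train_histograms, train_labels, query)
print(prediction)  # "up": overlap 2 with the first histogram, 1 with the second
```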
For the original Java version, see https://github.com/uea-machine-learning/tsml/blob/master/src/main/java/tsml/classifiers/dictionary_based/TDE.java
- Parameters
n_parameter_samples (int) – number of parameter combos to try (default=250)
max_ensemble_size (int) – maximum number of classifiers (default=50)
time_limit (int) – time contract to limit build time in minutes (default=0, no limit)
max_win_len_prop (float between 0 and 1) – maximum window length as a proportion of series length (default=1)
min_window (int) – minimum window size (default=10)
randomly_selected_params (int) – number of parameters randomly selected before GP is used (default=50)
bigrams (boolean or None) – whether to use bigrams (default=None: true for univariate, false for multivariate)
dim_threshold (float between 0 and 1) – dimension accuracy threshold for multivariate (default=0.85)
max_dims (int) – max number of dimensions for multivariate (default=20)
n_jobs (int, optional (default=1)) – the number of jobs to run in parallel for both fit and predict. -1 means using all processors.
random_state (optional) – random seed (default to no seed)
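Two of the defaults above depend on the data. The sketch below shows one way they could be resolved, going only by the parameter descriptions: the helper names are hypothetical, and truncating the maximum window length to an integer is an assumption rather than documented behaviour.

```python
def resolve_bigrams(bigrams, n_dims):
    """bigrams=None means True for univariate data, False for multivariate."""
    if bigrams is None:
        return n_dims == 1
    return bigrams


def max_window_length(series_length, max_win_len_prop=1.0):
    """Largest window length, as a proportion of the series length.

    Truncation to int is an assumption of this sketch.
    """
    return int(series_length * max_win_len_prop)


print(resolve_bigrams(None, n_dims=1))   # True: univariate default
print(resolve_bigrams(None, n_dims=3))   # False: multivariate default
print(max_window_length(100, 0.5))       # 50
```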
-
(<=max_ensemble_size)
Notes
- [1] Matthew Middlehurst, James Large, Gavin Cawley and Anthony Bagnall, “The Temporal Dictionary Ensemble (TDE) Classifier for Time Series Classification”, in Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2020.
Java version: https://github.com/uea-machine-learning/tsml/blob/master/src/main/java/tsml/classifiers/dictionary_based/TDE.java