sktime.transformations.panel.shapelets

Shapelet transformers, which transform data from the time domain into the shapelet domain. The module provides the standard full transform, a contracted version and a random sampler.

class sktime.transformations.panel.shapelets.ContractedShapeletTransform(min_shapelet_length=3, max_shapelet_length=inf, max_shapelets_to_store_per_class=200, time_contract_in_mins=60, num_candidates_to_sample_per_case=20, random_state=None, verbose=0, remove_self_similar=True)[source]

Bases: sktime.transformations.panel.shapelets.ShapeletTransform

Contracted Shapelet Transform.

@incollection{bostrom2017binary,
  title={Binary shapelet transform for multiclass time series classification},
  author={Bostrom, Aaron and Bagnall, Anthony},
  booktitle={Transactions on Large-Scale Data- and Knowledge-Centered Systems XXXII},
  pages={24–46},
  year={2017},
  publisher={Springer}
}

Parameters
  • min_shapelet_length (int) – Lower bound on candidate shapelet lengths (default = 3).

  • max_shapelet_length (int) – Upper bound on candidate shapelet lengths (default = inf or series length).

  • max_shapelets_to_store_per_class (int) – Upper bound on the number of shapelets to retain from each distinct class (default = 200).

  • time_contract_in_mins (float) – The number of minutes allowed for shapelet extraction (default = 60).

  • num_candidates_to_sample_per_case (int) – Number of candidate shapelets to assess per training series before moving on to the next series (default = 20).

  • random_state (RandomState, int, or None) – Controls random state objects for deterministic results (default = None).

  • verbose (int) – Level of output printed to the console, for information only (default = 0).

  • remove_self_similar (boolean) – Remove overlapping ("self-similar") shapelets from the final transform (default = True).

predefined_ig_rejection_level[source]

Minimum information gain required to keep a shapelet (default = 0.05).

Type

float

self.shapelets[source]

The stored shapelets after a dataset has been processed.

Type

list of Shapelet objects

class sktime.transformations.panel.shapelets.Shapelet(series_id, start_pos, length, info_gain, data)[source]

Bases: object

A simple class to model a Shapelet with associated information

Parameters
  • series_id (int) – The index of the series within the data (X) that was passed to fit.

  • start_pos (int) – The starting position of the shapelet within the original series

  • length (int) – The length of the shapelet

  • info_gain (float) – The calculated information gain of this shapelet

  • data (array-like) – The (z-normalised) data of this shapelet

class sktime.transformations.panel.shapelets.ShapeletPQ[source]

Bases: object

get_array()[source]
get_size()[source]
peek()[source]
pop()[source]
push(shapelet)[source]
class sktime.transformations.panel.shapelets.ShapeletTransform(min_shapelet_length=3, max_shapelet_length=inf, max_shapelets_to_store_per_class=200, random_state=None, verbose=0, remove_self_similar=True)[source]

Bases: sktime.transformations.base._PanelToTabularTransformer

Shapelet Transform.

Original journal publication:

@article{hills2014classification,
  title={Classification of time series by shapelet transformation},
  author={Hills, Jon and Lines, Jason and Baranauskas, Edgaras and Mapp, James and Bagnall, Anthony},
  journal={Data Mining and Knowledge Discovery},
  volume={28},
  number={4},
  pages={851–881},
  year={2014},
  publisher={Springer}
}

Parameters
  • min_shapelet_length (int) – Lower bound on candidate shapelet lengths (default = 3).

  • max_shapelet_length (int) – Upper bound on candidate shapelet lengths (default = inf or series length).

  • max_shapelets_to_store_per_class (int) – Upper bound on the number of shapelets to retain from each distinct class (default = 200).

  • random_state (RandomState, int, or None) – Controls random state objects for deterministic results (default = None).

  • verbose (int) – Level of output printed to the console, for information only (default = 0).

  • remove_self_similar (boolean) – Remove overlapping ("self-similar") shapelets from the final transform (default = True).

predefined_ig_rejection_level[source]

Minimum information gain required to keep a shapelet (default = 0.05).

Type

float

self.shapelets[source]

The stored shapelets after a dataset has been processed.

Type

list of Shapelet objects

static binary_entropy(num_this_class, num_other_class)[source]
static calc_binary_ig(orderline, total_num_this_class, total_num_other_class)[source]
static calc_early_binary_ig(orderline, num_this_class_in_orderline, num_other_class_in_orderline, num_to_add_this_class, num_to_add_other_class)[source]
static euclidean_distance_early_abandon(u, v, min_dist)[source]
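
The static helpers above are listed without descriptions. As an illustration of the early-abandon idea that the last name suggests, the following is a minimal sketch and not the library's code. The function name squared_distance_early_abandon, the use of a squared distance, the 1-D inputs and the choice to return infinity on abandonment are all assumptions made for this example.

```python
import math


def squared_distance_early_abandon(u, v, min_dist):
    """Squared Euclidean distance between u and v, abandoned early.

    Hypothetical helper: returns math.inf as soon as the running sum
    reaches min_dist, because the candidate can no longer beat the
    best distance found so far.
    """
    total = 0.0
    for a, b in zip(u, v):
        total += (a - b) ** 2
        if total >= min_dist:
            # No need to finish the sum: this window is already too far away.
            return math.inf
    return total
```

Passing min_dist=math.inf disables abandonment and yields the full squared distance, which is convenient for the first window compared.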
fit(X, y=None)[source]

Fit the shapelet transform to the given X and y.

Parameters
  • X (pandas DataFrame) – The training input samples.

  • y (array-like or list) – The class values for X

Returns

self – This estimator

Return type

ShapeletTransform

get_shapelets()[source]

An accessor method to return the extracted shapelets

Returns

shapelets

Return type

a list of Shapelet objects

static remove_self_similar_shapelets(shapelet_list)[source]

Remove self-similar shapelets from an input list. Note: this method assumes that shapelets are pre-sorted in descending order of quality (i.e. if two candidates are self-similar, the one with the later index will be removed)

Parameters

shapelet_list (list of Shapelet objects) – The shapelets to filter, pre-sorted in descending order of quality.

Returns

shapelet_list

Return type

list of Shapelet objects
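
The filtering described above can be sketched as follows. This is a simplified illustration, not the library's implementation. It assumes that two shapelets are "self-similar" when they come from the same series and their index ranges overlap. The names Candidate, overlaps and drop_self_similar are hypothetical.

```python
from collections import namedtuple

# Hypothetical stand-in for Shapelet, holding only the fields used here.
Candidate = namedtuple("Candidate", ["series_id", "start_pos", "length", "info_gain"])


def overlaps(s1, s2):
    """True if both candidates come from the same series and their ranges intersect."""
    if s1.series_id != s2.series_id:
        return False
    return (s1.start_pos < s2.start_pos + s2.length
            and s2.start_pos < s1.start_pos + s1.length)


def drop_self_similar(candidates):
    """Keep a candidate only if it overlaps no earlier (better) kept candidate.

    Assumes `candidates` is sorted in descending order of quality.
    """
    kept = []
    for cand in candidates:
        if not any(overlaps(cand, k) for k in kept):
            kept.append(cand)
    return kept
```

Because the input is assumed to be sorted by quality, the later of two overlapping candidates is always the one discarded, which matches the note in the description.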

transform(X, y=None)[source]

Transforms X according to the extracted shapelets (self.shapelets)

Parameters

X (pandas DataFrame) – The input dataframe to transform

Returns

output – The transformed dataframe in tabular format.

Return type

pandas DataFrame
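
To make the tabular output concrete, the sketch below computes one feature per (series, shapelet) pair as the smallest distance between the shapelet and any window of the same length in the series. This is a common formulation of the shapelet transform and is shown on plain NumPy arrays for univariate series. The helper names and the exact distance used are assumptions for this example, and the library's own code may differ (for example in normalisation or scaling).

```python
import numpy as np


def znorm(x):
    """Z-normalise a 1-D float array, returning zeros for a constant array."""
    std = x.std()
    return np.zeros_like(x) if std == 0 else (x - x.mean()) / std


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any same-length window."""
    m = len(shapelet)
    best = np.inf
    for start in range(len(series) - m + 1):
        window = znorm(series[start:start + m])
        best = min(best, np.linalg.norm(window - shapelet))
    return best


def shapelet_features(X, shapelets):
    """Return an (n_series, n_shapelets) array of minimum distances."""
    return np.array([[shapelet_distance(s, sh) for sh in shapelets] for s in X])
```

Each row of the result corresponds to one input series and each column to one shapelet, which is the tabular layout the method's return value describes.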

static zscore(a, axis=0, ddof=0)[source]

A static method to return the z-normalised version of a series. This mirrors the scipy implementation with one small difference: rather than dividing by zero when the standard deviation is zero, the function returns output = np.zeros(len(input)). This allows sensible processing of candidate shapelets and comparison subseries that are a straight line. Original version: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.zscore.html

Parameters
  • a (array_like) – An array like object containing the sample data.

  • axis (int or None, optional) – Axis along which to operate. Default is 0. If None, compute over the whole array a.

  • ddof (int, optional) – Degrees of freedom correction in the calculation of the standard deviation. Default is 0.

Returns

zscore – The z-scores, standardized by mean and standard deviation of input array a.

Return type

array_like
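
The zero-variance behaviour described for zscore can be sketched as below. This is a minimal illustration under the stated description, not the library's source. The name zscore_safe is hypothetical, and the use of np.divide with a mask is one possible way to avoid the division by zero.

```python
import numpy as np


def zscore_safe(a, axis=0, ddof=0):
    """Z-normalise `a`, returning zeros wherever the standard deviation is 0."""
    a = np.asarray(a, dtype=float)
    mean = a.mean(axis=axis, keepdims=True)
    std = a.std(axis=axis, ddof=ddof, keepdims=True)
    out = np.zeros_like(a)
    # Divide only where std is non-zero; other positions keep their zeros.
    np.divide(a - mean, std, out=out, where=std != 0)
    return out
```

A flat series such as [5, 5, 5] therefore maps to [0, 0, 0] instead of producing NaN values.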