autots.evaluator package

Submodules

autots.evaluator.auto_model module

Mid-level helper functions for AutoTS.

autots.evaluator.auto_model.ModelMonster(model: str, parameters: dict = {}, frequency: str = 'infer', prediction_interval: float = 0.9, holiday_country: str = 'US', startTimeStamps=None, forecast_length: int = 14, random_seed: int = 2020, verbose: int = 0)

Directs strings and parameters to appropriate model objects.

Parameters
  • model (str) – Name of Model Function

  • parameters (dict) – Dictionary of parameters to pass through to model

class autots.evaluator.auto_model.ModelObject(name: str = 'Uninitiated Model Name', frequency: str = 'infer', prediction_interval: float = 0.9, regression_type: str = None, fit_runtime=datetime.timedelta(0), holiday_country: str = 'US', random_seed: int = 2020, verbose: int = 0)

Bases: object

Generic class for holding forecasting models.

Models should all have methods:

  • .fit(df), taking a DataFrame with a DatetimeIndex and n columns of n time series

  • .predict(forecast_length = int, regressor)

  • .get_new_params(method)

Parameters
  • name (str) – Model Name

  • frequency (str) – String alias of datetime index frequency or else ‘infer’

  • prediction_interval (float) – Confidence interval for probabilistic forecast

basic_profile(df)

Capture basic training details.

create_forecast_index(forecast_length: int)

Generate a pd.DatetimeIndex appropriate for a new forecast.

Warning

Requires ModelObject.basic_profile() being called as part of .fit()

get_new_params(method: str = 'random')

Return dict of new parameters for parameter tuning.

get_params()

Return dict of current parameters.
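The interface described above for ModelObject can be sketched as a standalone class. This is a minimal illustration rather than the library's implementation: the class name LastValueSketch, its attribute names, and the naive last-value forecast are assumptions made for the example. Only the method names come from the documentation above.

```python
import pandas as pd


class LastValueSketch:
    """Hypothetical model following the .fit/.predict/.get_new_params pattern."""

    def __init__(self, name="LastValueSketch", frequency="infer"):
        self.name = name
        self.frequency = frequency

    def fit(self, df):
        # Record basic training details (similar in spirit to basic_profile()).
        self.column_names = df.columns
        self.train_last_date = df.index[-1]
        if self.frequency == "infer":
            # Returns None for irregular indexes; a real model would handle that.
            self.frequency = pd.infer_freq(df.index)
        self.last_values = df.iloc[-1]
        return self

    def create_forecast_index(self, forecast_length):
        # Generate forecast_length + 1 dates starting at the last training
        # date, then drop that first date so the index starts one step later.
        index = pd.date_range(
            self.train_last_date, periods=forecast_length + 1, freq=self.frequency
        )
        return index[1:]

    def predict(self, forecast_length):
        # Repeat the last observed row for every forecast step.
        index = self.create_forecast_index(forecast_length)
        rows = [self.last_values.values] * forecast_length
        return pd.DataFrame(rows, index=index, columns=self.column_names)

    def get_new_params(self, method="random"):
        # This sketch has no tunable parameters.
        return {}

    def get_params(self):
        return {}
```

Note that create_forecast_index() relies on values stored during fit(), which mirrors the warning above about basic_profile() being called as part of .fit().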

autots.evaluator.auto_model.ModelPrediction(df_train, forecast_length: int, transformation_dict: dict, model_str: str, parameter_dict: dict, frequency: str = 'infer', prediction_interval: float = 0.9, no_negatives: bool = False, constraint: float = None, future_regressor_train=[], future_regressor_forecast=[], holiday_country: str = 'US', startTimeStamps=None, grouping_ids=None, random_seed: int = 2020, verbose: int = 0)

Feed parameters into the modeling pipeline.

Parameters
  • df_train (pandas.DataFrame) – numeric training dataset of DatetimeIndex and series as cols

  • forecast_length (int) – number of periods to forecast

  • transformation_dict (dict) – a dictionary of outlier, fillNA, and transformation methods to be used

  • model_str (str) – a string to be directed to the appropriate model, used in ModelMonster

  • frequency (str) – str representing frequency alias of time series

  • prediction_interval (float) – width of errors (note: rarely do the intervals accurately match the % asked for…)

  • no_negatives (bool) – whether to force all forecasts to be >= 0

  • constraint (float) – when not None, forecast values are limited to at most this value * the data's standard deviation above the training max or below the training min.

  • future_regressor_train (pd.Series) – series with a datetime index of data known in advance; the section matching the training data

  • future_regressor_forecast (pd.Series) – series with a datetime index of data known in advance; the section matching the test data

  • holiday_country (str) – passed through to holiday package, used by a few models as 0/1 regressor.

  • startTimeStamps (pd.Series) – index (series_ids), columns (Datetime of First start of series)

Returns

Prediction from AutoTS model object

Return type

PredictionObject (autots.PredictionObject)
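The constraint and no_negatives parameters described above can be illustrated with a short sketch. It assumes the bounds are computed per series from the training data and applied by clipping. The function name constrain_forecast and the use of the population standard deviation are assumptions for illustration, not the library's code.

```python
import numpy as np


def constrain_forecast(forecast, train, constraint=None, no_negatives=False):
    """Clip forecasts using training-data bounds (hypothetical helper)."""
    forecast = np.asarray(forecast, dtype=float)
    train = np.asarray(train, dtype=float)
    if constraint is not None:
        # Allowed band: [min - c * std, max + c * std], computed per column.
        spread = constraint * np.nanstd(train, axis=0)
        upper = np.nanmax(train, axis=0) + spread
        lower = np.nanmin(train, axis=0) - spread
        forecast = np.clip(forecast, lower, upper)
    if no_negatives:
        # Negative predictions are raised to 0.
        forecast = np.clip(forecast, 0, None)
    return forecast
```

With training values 0, 2, 4 and constraint=1, a forecast of 10 would be pulled down to 4 plus one standard deviation of the training data.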

autots.evaluator.auto_model.NewGeneticTemplate(model_results, submitted_parameters, sort_column: str = 'smape_weighted', sort_ascending: bool = True, max_results: int = 50, max_per_model_class: int = 5, top_n: int = 50, template_cols: list = ['Model', 'ModelParameters', 'TransformationParameters', 'Ensemble'])

Return new template given old template with model accuracies.

Parameters
  • model_results (pandas.DataFrame) – models that have actually been run

  • submitted_parameters (pandas.DataFrame) – models tried (may have returned different parameters to results)

autots.evaluator.auto_model.PredictWitch(template, df_train, forecast_length: int, frequency: str = 'infer', prediction_interval: float = 0.9, no_negatives: bool = False, constraint: float = None, future_regressor_train=[], future_regressor_forecast=[], holiday_country: str = 'US', startTimeStamps=None, grouping_ids=None, random_seed: int = 2020, verbose: int = 0, template_cols: list = ['Model', 'ModelParameters', 'TransformationParameters', 'Ensemble'])

Takes numeric data, returns numeric forecasts. Only one model (albeit potentially an ensemble)!

Well, she turned me into a newt. A newt? I got better. -Python

Parameters
  • df_train (pandas.DataFrame) – numeric training dataset of DatetimeIndex and series as cols

  • forecast_length (int) – number of periods to forecast

  • transformation_dict (dict) – a dictionary of outlier, fillNA, and transformation methods to be used

  • model_str (str) – a string to be directed to the appropriate model, used in ModelMonster

  • frequency (str) – str representing frequency alias of time series

  • prediction_interval (float) – width of errors (note: rarely do the intervals accurately match the % asked for…)

  • no_negatives (bool) – whether to force all forecasts to be >= 0

  • constraint (float) – when not None, forecast values are limited to at most this value * the data's standard deviation above the training max or below the training min.

  • future_regressor_train (pd.Series) – series with a datetime index of data known in advance; the section matching the training data

  • future_regressor_forecast (pd.Series) – series with a datetime index of data known in advance; the section matching the test data

  • holiday_country (str) – passed through to holiday package, used by a few models as 0/1 regressor.

  • startTimeStamps (pd.Series) – index (series_ids), columns (Datetime of First start of series)

  • template_cols (list) – column names of columns used as model template

Returns

Prediction from AutoTS model object

Return type

PredictionObject (autots.PredictionObject)

class autots.evaluator.auto_model.PredictionObject(model_name: str = 'Uninitiated', forecast_length: int = 0, forecast_index=nan, forecast_columns=nan, lower_forecast=nan, forecast=nan, upper_forecast=nan, prediction_interval: float = 0.9, predict_runtime=datetime.timedelta(0), fit_runtime=datetime.timedelta(0), model_parameters={}, transformation_parameters={}, transformation_runtime=datetime.timedelta(0))

Bases: object

Generic class for holding forecast information.

total_runtime()

Combine runtimes.
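As a rough sketch of what combining runtimes could look like, the three timedelta attributes from the PredictionObject signature can simply be added. That total_runtime() sums exactly these three values is an assumption here, and the standalone function below is illustrative only.

```python
import datetime


def total_runtime_sketch(fit_runtime, predict_runtime, transformation_runtime):
    """Sum three datetime.timedelta values into one total (illustrative)."""
    return fit_runtime + predict_runtime + transformation_runtime
```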

autots.evaluator.auto_model.RandomTemplate(n: int = 10, model_list: list = ['ZeroesNaive', 'LastValueNaive', 'AverageValueNaive', 'GLS', 'GLM', 'ETS', 'ARIMA', 'FBProphet', 'RollingRegression', 'GluonTS', 'UnobservedComponents', 'VARMAX', 'VECM', 'DynamicFactor'])

Returns a template dataframe of randomly generated transformations, models, and hyperparameters.

Parameters

n (int) – number of random models to return

class autots.evaluator.auto_model.TemplateEvalObject(model_results=Empty DataFrame Columns: [] Index: [], per_timestamp_smape=Empty DataFrame Columns: [] Index: [], per_series_mae=Empty DataFrame Columns: [] Index: [], per_series_spl=Empty DataFrame Columns: [] Index: [], per_series_rmse1=Empty DataFrame Columns: [] Index: [], per_series_rmse2=Empty DataFrame Columns: [] Index: [], model_count: int = 0)

Bases: object

Object to contain all your failures!

concat(another_eval)

Merge another TemplateEvalObject onto this one.

save(filename)

Save results to a file.

autots.evaluator.auto_model.TemplateWizard(template, df_train, df_test, weights, model_count: int = 0, ensemble: str = True, forecast_length: int = 14, frequency: str = 'infer', prediction_interval: float = 0.9, no_negatives: bool = False, constraint: float = None, future_regressor_train=[], future_regressor_forecast=[], holiday_country: str = 'US', startTimeStamps=None, random_seed: int = 2020, verbose: int = 0, validation_round: int = 0, model_interrupt: bool = False, grouping_ids=None, template_cols: list = ['Model', 'ModelParameters', 'TransformationParameters', 'Ensemble'])

Take Template, returns Results.

There are some who call me… Tim. - Python

Parameters
  • template (pandas.DataFrame) – containing model str, and json of transformations and hyperparameters

  • df_train (pandas.DataFrame) – numeric training dataset of DatetimeIndex and series as cols

  • df_test (pandas.DataFrame) – dataframe of actual values of (forecast length * n series)

  • weights (dict) – key = column/series_id, value = weight

  • ensemble (str) – desc of ensemble types to prepare metric collection

  • forecast_length (int) – number of periods to forecast

  • transformation_dict (dict) – a dictionary of outlier, fillNA, and transformation methods to be used

  • model_str (str) – a string to be directed to the appropriate model, used in ModelMonster

  • frequency (str) – str representing frequency alias of time series

  • prediction_interval (float) – width of errors (note: rarely do the intervals accurately match the % asked for…)

  • no_negatives (bool) – whether to force all forecasts to be >= 0

  • constraint (float) – when not None, forecast values are limited to at most this value * the data's standard deviation above the training max or below the training min.

  • future_regressor_train (pd.Series) – series with a datetime index of data known in advance; the section matching the training data

  • future_regressor_forecast (pd.Series) – series with a datetime index of data known in advance; the section matching the test data

  • holiday_country (str) – passed through to holiday package, used by a few models as 0/1 regressor.

  • startTimeStamps (pd.Series) – index (series_ids), columns (Datetime of First start of series)

  • validation_round (int) – int passed to record current validation.

  • model_interrupt (bool) – if True, keyboard interrupts are caught and only break current model eval.

  • template_cols (list) – column names of columns used as model template

Returns

TemplateEvalObject

autots.evaluator.auto_model.UniqueTemplates(existing_templates, new_possibilities, selection_cols: list = ['Model', 'ModelParameters', 'TransformationParameters', 'Ensemble'])

Returns unique dataframe rows from new_possibilities not in existing_templates.

Parameters

selection_cols (list) – list of column names to use to judge uniqueness/match on
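One way to express "rows of one DataFrame not present in another" is a pandas anti-join. The sketch below assumes matching is done by exact equality on selection_cols. The function name unique_rows_sketch is hypothetical and this is not necessarily how UniqueTemplates is implemented.

```python
import pandas as pd


def unique_rows_sketch(existing_templates, new_possibilities, selection_cols):
    """Return rows of new_possibilities whose selection_cols are not in existing_templates."""
    keys = existing_templates[selection_cols].drop_duplicates()
    # indicator=True adds a "_merge" column marking where each row was found.
    merged = new_possibilities.merge(
        keys, on=selection_cols, how="left", indicator=True
    )
    only_new = merged.loc[merged["_merge"] == "left_only", new_possibilities.columns]
    # Also drop repeats within the new rows themselves.
    return only_new.drop_duplicates(subset=selection_cols).reset_index(drop=True)
```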

autots.evaluator.auto_model.create_model_id(model_str: str, parameter_dict: dict = {}, transformation_dict: dict = {})

Create a hash ID which should be unique to the model parameters.
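A hash ID of this kind can be sketched by serializing the inputs deterministically and hashing the result. The choice of JSON with sorted keys and SHA-256 below is an assumption for illustration, since the documentation does not state which serialization or hash create_model_id uses.

```python
import hashlib
import json


def model_id_sketch(model_str, parameter_dict=None, transformation_dict=None):
    """Build a deterministic hex ID from a model name and its parameter dicts."""
    payload = json.dumps(
        [model_str, parameter_dict or {}, transformation_dict or {}],
        sort_keys=True,  # key order must not change the ID
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Sorting keys before hashing means two dictionaries with the same contents in a different insertion order produce the same ID.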

autots.evaluator.auto_model.dict_recombination(a: dict, b: dict)

Recombine two dictionaries with identical keys. Return new dict.
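Recombination of two parameter dictionaries, as used in genetic algorithms, can be sketched as picking each key's value from one parent at random. The uniform per-key choice and the function name recombine_sketch are assumptions. The library's dict_recombination may select values differently.

```python
import random


def recombine_sketch(a, b, rng=random):
    """Return a new dict taking each value from either a or b at random."""
    if a.keys() != b.keys():
        raise ValueError("both dictionaries must have identical keys")
    return {key: rng.choice([a[key], b[key]]) for key in a}
```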

autots.evaluator.auto_model.generate_score(model_results, metric_weighting: dict = {}, prediction_interval: float = 0.9)

Generate score based on relative accuracies.
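A score based on relative accuracies might be built by dividing each error metric by its best (smallest) value and summing the weighted ratios. The sketch below is one such scheme under stated assumptions: lower values are better, every metric column has a positive minimum, and the weights are keyed directly by column name. It is not the library's exact formula, and the column names used in any call are hypothetical.

```python
import pandas as pd


def relative_score_sketch(model_results, metric_weighting):
    """Weighted sum of each metric relative to the best model's value (lower is better)."""
    score = pd.Series(0.0, index=model_results.index)
    for column, weight in metric_weighting.items():
        best = model_results[column].min()
        # The best model contributes exactly `weight`; worse models contribute more.
        score = score + weight * model_results[column] / best
    return score
```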

autots.evaluator.auto_model.seasonal_int(include_one: bool = False)

Generate a random integer of typical seasonalities.

autots.evaluator.auto_model.trans_dict_recomb(dict_array)

Recombine two transformation param dictionaries from array of dicts.

autots.evaluator.auto_model.unpack_ensemble_models(template, template_cols: list = ['Model', 'ModelParameters', 'TransformationParameters', 'Ensemble'], keep_ensemble: bool = True, recursive: bool = False)

Take ensemble models from template and add as new rows.

autots.evaluator.auto_model.validation_aggregation(validation_results)

Aggregate a TemplateEvalObject.

autots.evaluator.auto_ts module

Higher-level backbone of auto time series modeling.

class autots.evaluator.auto_ts.AutoTS(forecast_length: int = 14, frequency: str = 'infer', prediction_interval: float = 0.9, max_generations: int = 5, no_negatives: bool = False, constraint: float = None, ensemble: str = 'simple', initial_template: str = 'General+Random', random_seed: int = 2020, holiday_country: str = 'US', subset: int = None, aggfunc: str = 'first', na_tolerance: float = 1, metric_weighting: dict = {'containment_weighting': 0, 'contour_weighting': 0, 'mae_weighting': 2, 'rmse_weighting': 2, 'runtime_weighting': 0, 'smape_weighting': 10, 'spl_weighting': 1}, drop_most_recent: int = 0, drop_data_older_than_periods: int = 100000, model_list: str = 'default', num_validations: int = 2, models_to_validate: float = 0.15, max_per_model_class: int = None, validation_method: str = 'even', min_allowed_train_percent: float = 0.5, remove_leading_zeroes: bool = False, model_interrupt: bool = False, verbose: int = 1)

Bases: object

Automate time series modeling using a genetic algorithm.

Parameters
  • forecast_length (int) – number of periods over which to evaluate forecast. Can be overridden later in .predict().

  • frequency (str) – ‘infer’ or a specific pandas datetime offset. Can be used to force rollup of data (i.e. daily input, but frequency ‘M’ will roll up to monthly).

  • prediction_interval (float) – 0-1, uncertainty range for upper and lower forecasts. Adjusts the range, but rarely matches actual containment.

  • max_generations (int) – number of genetic algorithm generations to run. More runs = longer runtime, generally better accuracy.

  • no_negatives (bool) – if True, all negative predictions are rounded up to 0.

  • constraint (float) – when not None, forecast values are limited to at most this value * the data's standard deviation above the training max or below the training min. Applied to point forecast only, not upper/lower forecasts.

  • ensemble (str) – None, ‘simple’, ‘distance’

  • initial_template (str) – ‘Random’ randomly generates a starting template, ‘General’ uses the template included in the package, ‘General+Random’ uses both. Can also be overridden with self.import_template()

  • random_seed (int) – random seed allows (slightly) more consistent results.

  • holiday_country (str) – passed through to Holidays package for some models.

  • subset (int) – maximum number of series to evaluate at once. Useful to speed evaluation when many series are input.

  • aggfunc (str) – aggregation applied if data is to be rolled up to a coarser frequency (daily -> monthly) or if duplicate timestamps are included. Default ‘first’ removes duplicates; for rollup try ‘mean’ or np.sum. Beware that numeric aggregations like ‘mean’ will not work with non-numeric inputs.

  • na_tolerance (float) – 0 to 1. Series are dropped if they have more than this percent NaN. 0.95 here would allow series containing up to 95% NaN values.

  • metric_weighting (dict) – weights to assign to metrics, affecting how the ranking score is generated.

  • drop_most_recent (int) – option to drop n most recent data points. Useful, say, for monthly sales data where the current (unfinished) month is included.

  • drop_data_older_than_periods (int) – take only the n most recent timestamps

  • model_list (list) – list of names of model objects to use

  • num_validations (int) – number of cross validations to perform. 0 for just train/test on final split.

  • models_to_validate (int or float) – top n models to pass through to cross validation, or a float in 0 to 1 as % of models tried. 0.99 is forced to 100% validation. 1 evaluates just 1 model. If horizontal or probabilistic ensemble, then additional min per_series models above the number here may be added to validation.

  • max_per_model_class (int) – of the models_to_validate what is the maximum to pass from any one model class/family.

  • validation_method (str) – ‘even’, ‘backwards’, or ‘seasonal n’ where n is an integer seasonal period length. ‘backwards’ is better for recency and for shorter training sets. ‘even’ splits the data into equally-sized slices, best for more consistent data. ‘seasonal n’, for example ‘seasonal 364’, would test all data on each previous year of the forecast_length that would immediately follow the training data.

  • min_allowed_train_percent (float) – percent of forecast length to allow as min training, else raises error. 0.5 with a forecast length of 10 would mean 5 training points are mandated, for a total of 15 points. Useful in (unrecommended) cases where forecast_length > training length.

  • remove_leading_zeroes (bool) – replace leading zeroes with NaN. Useful in data where initial zeroes mean data collection hasn’t started yet.

  • model_interrupt (bool) – if False, KeyboardInterrupts quit the entire program. If True, KeyboardInterrupts attempt to quit only the current model. If True, use in conjunction with verbose > 0 and result_file is recommended in the event of accidental complete termination.

  • verbose (int) – setting to 0 or lower should reduce most output. Higher numbers give more output.
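The arithmetic behind min_allowed_train_percent in the list above can be written out explicitly. This sketch only reproduces the worked example from the parameter description (0.5 with a forecast length of 10 requires 15 points in total). The function names and the rounding up of fractional training points are assumptions.

```python
import math


def required_points(forecast_length, min_allowed_train_percent=0.5):
    """Total points needed: minimum training points plus the forecast horizon."""
    min_train = math.ceil(forecast_length * min_allowed_train_percent)
    return min_train + forecast_length


def check_enough_data(n_points, forecast_length, min_allowed_train_percent=0.5):
    """Raise ValueError if the series is too short for the requested horizon."""
    needed = required_points(forecast_length, min_allowed_train_percent)
    if n_points < needed:
        raise ValueError(f"need at least {needed} points, got {n_points}")
    return True
```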

best_model

DataFrame containing template for the best ranked model

Type

pandas.DataFrame

regression_check

If True, the best_model uses an input ‘User’ future_regressor

Type

bool

export_template(filename, models: str = 'best', n: int = 5, max_per_model_class: int = None, include_results: bool = False)

Export top results as a reusable template.

Parameters
  • filename (str) – ‘csv’ or ‘json’ (in filename). None to return a dataframe and not write a file.

  • models (str) – ‘best’ or ‘all’

  • n (int) – if models = ‘best’, how many n-best to export

  • max_per_model_class (int) – if models = ‘best’, the max number of each model class to include in template

  • include_results (bool) – whether to include performance metrics

fit(df, date_col: str = None, value_col: str = None, id_col: str = None, future_regressor=[], weights: dict = {}, result_file: str = None, grouping_ids=None)

Train algorithm given data supplied.

Parameters
  • df (pandas.DataFrame) – Datetime Indexed dataframe of series, or dataframe of three columns as below.

  • date_col (str) – name of datetime column

  • value_col (str) – name of column containing the data of series.

  • id_col (str) – name of column identifying different series.

  • future_regressor (numpy.Array) – single external regressor matching train.index

  • weights (dict) – {‘colname1’: 2, ‘colname2’: 5} - increase importance of a series in metric evaluation. Any left blank assumed to have weight of 1.

  • result_file (str) – results saved on each new generation. Does not include validation rounds. “.csv” save model results table. “.pickle” saves full object, including ensemble information.

  • grouping_ids (dict) – currently a one-level dict containing series_id:group_id mapping.
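The two input shapes accepted by fit() are related by a pivot: a "long" table with date, value, and id columns can be reshaped into a datetime-indexed "wide" table with one column per series. The sketch below shows that reshaping with pandas under the assumption that duplicates are resolved with the aggfunc described earlier. The function name long_to_wide_sketch is hypothetical and this is not the library's internal routine.

```python
import pandas as pd


def long_to_wide_sketch(df_long, date_col="datetime", value_col="value",
                        id_col="series_id", aggfunc="first"):
    """Pivot a long table into a DatetimeIndex-ed frame, one column per series."""
    df = df_long.copy()
    df[date_col] = pd.to_datetime(df[date_col])
    wide = df.pivot_table(
        values=value_col, index=date_col, columns=id_col, aggfunc=aggfunc
    )
    return wide.sort_index()
```

With aggfunc='first', a duplicated timestamp for the same series keeps only the first value encountered, which matches the described default behavior of removing duplicates.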

import_results(filename)

Add results from another run on the same data.

Input can be a filename ending in .csv or .pickle, a DataFrame of model results, or a full TemplateEvalObject.

import_template(filename: str, method: str = 'Add On', enforce_model_list: bool = True)

Import a previously exported template of model parameters. Must be done before the AutoTS object is .fit().

Parameters
  • filename (str) – file location (or a pd.DataFrame already loaded)

  • method (str) – ‘Add On’ or ‘Only’

  • enforce_model_list (bool) – if True, remove model types not in model_list

predict(forecast_length: int = 'self', prediction_interval: float = 'self', future_regressor=[], hierarchy=None, just_point_forecast: bool = False, verbose: int = 'self')

Generate forecast data immediately following dates of index supplied to .fit().

Parameters
  • forecast_length (int) – Number of periods of data to forecast ahead

  • prediction_interval (float) – interval of upper/lower forecasts. Defaults to ‘self’, i.e. the interval specified in __init__(). If prediction_interval is a list, then returns a dict of forecast objects.

  • future_regressor (numpy.Array) – additional regressor, not used

  • hierarchy – Not yet implemented

  • just_point_forecast (bool) – If True, return a pandas.DataFrame of just point forecasts

Returns

Either a PredictionObject of forecasts and metadata, or if just_point_forecast == True, a dataframe of point forecasts

results(result_set: str = 'initial')

Convenience function to return tested models table.

Parameters

result_set (str) – ‘validation’ or ‘initial’

class autots.evaluator.auto_ts.AutoTSIntervals

Bases: object

AutoTS looped to test multiple prediction intervals. Experimental.

Runs max_generations on first prediction interval, then validates on remainder. Most args are passed through to AutoTS().

Parameters
  • interval_models_to_validate (int) – number of models to validate on each prediction interval.

  • import_results (str) – results from run on same data to load, filename.pickle. Currently result_file and import only save/load initial run, no validations.

fit(prediction_intervals, forecast_length, df_long, max_generations, num_validations, validation_method, models_to_validate, interval_models_to_validate, date_col, value_col, id_col=None, import_template=None, import_method='only', import_results=None, result_file=None, model_list='all', metric_weighting: dict = {'containment_weighting': 0, 'contour_weighting': 0, 'mae_weighting': 0, 'rmse_weighting': 1, 'runtime_weighting': 0, 'smape_weighting': 1, 'spl_weighting': 10}, weights: dict = {}, grouping_ids=None, future_regressor=[], model_interrupt: bool = False, constraint=2, no_negatives=False, remove_leading_zeroes=False, random_seed=2020)

Train and find best.

predict(future_regressor=[], verbose: int = 'self') → dict

Generate forecasts after training complete.

autots.evaluator.auto_ts.error_correlations(all_result, result: str = 'corr')

One-hot encode AutoTS result df and return df or correlation with errors.

Parameters
  • all_result (pandas.DataFrame) – AutoTS model_results df

  • result (str) – whether to return ‘df’, ‘corr’, ‘poly corr’ with errors
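The general idea of one-hot encoding a results table and correlating the encoded columns with an error metric can be sketched with pandas. The column names "Model" and "smape" in any call and the function name onehot_error_corr_sketch are assumptions for illustration. The actual error_correlations function may encode different columns and offers more result types.

```python
import pandas as pd


def onehot_error_corr_sketch(results, error_col="smape"):
    """Correlate one-hot encoded categorical columns with an error column."""
    # Categorical/string columns become 0/1 indicator columns.
    encoded = pd.get_dummies(results.drop(columns=[error_col]), dtype=float)
    encoded[error_col] = results[error_col]
    # Keep only each indicator's correlation with the error column.
    return encoded.corr()[error_col].drop(error_col)
```

A positive correlation for an indicator column suggests that the corresponding choice tends to coincide with larger errors in this particular results table.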

autots.evaluator.auto_ts.fake_regressor(df_long, forecast_length: int = 14, date_col: str = 'datetime', value_col: str = 'value', id_col: str = 'series_id', frequency: str = 'infer', aggfunc: str = 'first', drop_most_recent: int = 0, na_tolerance: float = 0.95, drop_data_older_than_periods: int = 100000, dimensions: int = 1)

Create a fake regressor of random numbers for testing purposes.

autots.evaluator.metrics module

Tools for calculating forecast errors.

class autots.evaluator.metrics.EvalObject(model_name: str = 'Uninitiated', per_series_metrics=nan, per_timestamp=nan, avg_metrics=nan, avg_metrics_weighted=nan)

Bases: object

Object to contain all your failures!

autots.evaluator.metrics.PredictionEval(PredictionObject, actual, series_weights: dict = {}, df_train=nan, per_timestamp_errors: bool = False, dist_n: int = None)

Evaluate prediction against test actual.

Parameters
  • PredictionObject (autots.PredictionObject) – Prediction from AutoTS model object

  • actual (pd.DataFrame) – dataframe of actual values of (forecast length * n series)

  • series_weights (dict) – key = column/series_id, value = weight

  • per_timestamp_errors (bool) – whether to calculate and return per timestamp direction errors

  • dist_n (int) – if not None, calculates two part rmse on head(n) and tail(remainder) of forecast.
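The two-part RMSE mentioned for dist_n can be sketched by splitting the forecast horizon at row n and computing RMSE separately on each part. The function name two_part_rmse_sketch and the NaN-skipping behavior are assumptions made for this example.

```python
import numpy as np


def two_part_rmse_sketch(actual, forecast, dist_n):
    """RMSE per series on the first dist_n rows and on the remaining rows."""
    squared_error = (np.asarray(actual, dtype=float)
                     - np.asarray(forecast, dtype=float)) ** 2
    head = np.sqrt(np.nanmean(squared_error[:dist_n], axis=0))
    tail = np.sqrt(np.nanmean(squared_error[dist_n:], axis=0))
    return head, tail
```

Comparing the two values shows whether errors grow toward the end of the forecast horizon.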

autots.evaluator.metrics.SPL(A, F, df_train, quantile)

Scaled pinball loss.
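As a sketch of the idea, pinball (quantile) loss penalizes under- and over-forecasts asymmetrically according to the quantile, and a scaled version divides by the mean absolute one-step change in the training data so that series of different magnitudes are comparable. The scaling choice below follows a common convention and is an assumption, not confirmed as the library's implementation. The function names are hypothetical.

```python
import numpy as np


def pinball_loss_sketch(A, F, quantile):
    """Mean pinball loss per series for actuals A and quantile forecasts F."""
    diff = np.asarray(A, dtype=float) - np.asarray(F, dtype=float)
    # Under-forecasts cost quantile * diff, over-forecasts cost (1 - quantile) * |diff|.
    loss = np.maximum(quantile * diff, (quantile - 1) * diff)
    return np.nanmean(loss, axis=0)


def scaled_pinball_loss_sketch(A, F, df_train, quantile):
    """Pinball loss divided by the mean absolute one-step change of the training data."""
    train = np.asarray(df_train, dtype=float)
    scale = np.nanmean(np.abs(np.diff(train, axis=0)), axis=0)
    return pinball_loss_sketch(A, F, quantile) / scale
```

As with plain pinball loss, larger values indicate a worse quantile forecast.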

autots.evaluator.metrics.containment(lower_forecast, upper_forecast, actual)

Expects three 2-D numpy arrays of forecast_length * n series.

Returns a 1-D array of results, one per series.

Parameters
  • lower_forecast (numpy.array) – lower bound of predicted values

  • upper_forecast (numpy.array) – upper bound of predicted values

  • actual (numpy.array) – known true values
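Containment is commonly understood as the fraction of actual values that fall between the lower and upper forecasts. The sketch below assumes inclusive bounds and three arrays of shape forecast_length * n series. The function name containment_sketch is hypothetical.

```python
import numpy as np


def containment_sketch(lower_forecast, upper_forecast, actual):
    """Fraction of actual values inside [lower, upper], per series."""
    lower = np.asarray(lower_forecast, dtype=float)
    upper = np.asarray(upper_forecast, dtype=float)
    actual = np.asarray(actual, dtype=float)
    inside = (actual >= lower) & (actual <= upper)
    return inside.mean(axis=0)
```

Comparing this value with the requested prediction_interval shows how closely the interval matches actual coverage.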

autots.evaluator.metrics.contour(A, F)

A measure of how well the actual and forecast follow the same pattern of change. Note: if actual values are unchanging, they will match positive changing forecasts.

Expects two 2-D numpy arrays of forecast_length * n series.

Returns a 1-D array of results, one per series.

Parameters
  • A (numpy.array) – known true values

  • F (numpy.array) – predicted values
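One way to measure whether two series follow the same pattern of change is to compare the direction of their step-to-step differences. The sketch below treats a flat step in the actuals as "not decreasing", which reproduces the behavior noted above where unchanging actuals match positively changing forecasts. The exact comparison rules and the function name contour_sketch are assumptions.

```python
import numpy as np


def contour_sketch(A, F):
    """Share of steps where actual and forecast move in the same direction, per series."""
    actual_up = np.diff(np.asarray(A, dtype=float), axis=0) >= 0  # flat counts as up
    forecast_up = np.diff(np.asarray(F, dtype=float), axis=0) > 0
    return np.mean(actual_up == forecast_up, axis=0)
```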

autots.evaluator.metrics.mae(A, F)

Expects two, 2-D numpy arrays of forecast_length * n series.

Returns a 1-D array of results, one per series.

Parameters
  • A (numpy.array) – known true values

  • F (numpy.array) – predicted values
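Mean absolute error over arrays of this shape reduces along the time axis, leaving one value per series. A minimal sketch follows, with the NaN-skipping mean as an assumption and the function name mae_sketch chosen for illustration.

```python
import numpy as np


def mae_sketch(A, F):
    """Mean absolute error per series for 2-D arrays of shape (forecast_length, n_series)."""
    error = np.abs(np.asarray(A, dtype=float) - np.asarray(F, dtype=float))
    return np.nanmean(error, axis=0)
```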

autots.evaluator.metrics.pinball_loss(A, F, quantile)

Bigger is bad-er.

autots.evaluator.metrics.rmse(actual, forecast)

Expects two, 2-D numpy arrays of forecast_length * n series.

Returns a 1-D array of results, one per series.

Parameters
  • actual (numpy.array) – known true values

  • forecast (numpy.array) – predicted values

autots.evaluator.metrics.smape(actual, forecast)

Expects two 2-D numpy arrays of forecast_length * n series. Allows NaN in actuals, and corresponding NaN in forecast, but not unmatched NaN in forecast. Also doesn’t like zeroes in either forecast or actual: these result in a poor error value even if the forecast is accurate.

Returns a 1-D array of results, one per series.

Parameters
  • actual (numpy.array) – known true values

  • forecast (numpy.array) – predicted values

References

https://en.wikipedia.org/wiki/Symmetric_mean_absolute_percentage_error
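The definition in the reference above can be sketched as follows. This version uses the form with the mean of |actual| and |forecast| in the denominator, expressed as a percentage, and simply skips NaN terms. Both choices are assumptions and may differ from the library's handling, and the function name smape_sketch is hypothetical.

```python
import numpy as np


def smape_sketch(actual, forecast):
    """Symmetric mean absolute percentage error (0 to 200) per series."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    denominator = (np.abs(actual) + np.abs(forecast)) / 2
    # 0/0 terms (both values zero) become NaN and are skipped by nanmean.
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.abs(forecast - actual) / denominator
    return 100 * np.nanmean(ratio, axis=0)
```

The sensitivity to zeroes is visible here: an actual of 0 with a forecast of 0.001 yields the maximum error of 200 even though the forecast is nearly exact.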

Module contents

Model Evaluators