abacusai.model

Module Contents

Classes

Model

A model

class abacusai.model.Model(client, name=None, modelId=None, modelConfig=None, modelPredictionConfig=None, createdAt=None, projectId=None, shared=None, sharedAt=None, trainFunctionName=None, predictFunctionName=None, predictManyFunctionName=None, initializeFunctionName=None, trainingInputTables=None, sourceCode=None, cpuSize=None, memory=None, trainingFeatureGroupIds=None, isPythonModel=None, defaultAlgorithm=None, customAlgorithmConfigs=None, location={}, refreshSchedules={}, codeSource={}, latestModelVersion={})

Bases: abacusai.return_class.AbstractApiClass

A model

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • name (str) – The user-friendly name for the model.

  • modelId (str) – The unique identifier of the model.

  • modelConfig (dict) – The training config options used to train this model.

  • modelPredictionConfig (dict) – The prediction config options for the model.

  • createdAt (str) – Date and time at which the model was created.

  • projectId (str) – The project this model belongs to.

  • shared (bool) – Whether the model is shared to the Abacus.AI model showcase.

  • sharedAt (str) – The date and time at which the model was shared to the model showcase.

  • trainFunctionName (str) – Name of the function found in the source code that will be executed to train the model. It is not executed when this function is run.

  • predictFunctionName (str) – Name of the function found in the source code that will be executed to run predictions through the model. It is not executed when this function is run.

  • predictManyFunctionName (str) – Name of the function found in the source code that will be executed to run batch predictions through the model.

  • initializeFunctionName (str) – Name of the function found in the source code that initializes the trained model before it is used to make predictions.

  • trainingInputTables (list) – List of feature groups that are supplied to the train function as parameters. Each parameter is a materialized DataFrame (same type as the function's return value).

  • sourceCode (str) – Python code used to make the model.

  • cpuSize (str) – CPU size specified for the Python model training.

  • memory (int) – Memory in GB specified for the python model training.

  • trainingFeatureGroupIds (list of unique string identifiers) – The unique identifiers of the feature groups used as inputs to train this model.

  • isPythonModel (bool) – Whether this model is handled as a Python model.

  • defaultAlgorithm (str) – If set, this algorithm will always be used when deploying the model, regardless of the model metrics.

  • customAlgorithmConfigs (dict) – User-defined configs for each of the user-defined custom algorithms.

  • latestModelVersion (ModelVersion) – The latest model version.

  • location (ModelLocation) – Location information for models that are imported.

  • refreshSchedules (RefreshSchedule) – List of refresh schedules that indicate when the next model version will be trained

  • codeSource (CodeSource) – If a python model, information on the source code

__repr__()

Return repr(self).

to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict
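As a rough illustration, the dict returned by to_dict() can be serialized for logging or inspection. The helper name model_to_json is hypothetical, and the sketch assumes only that to_dict() returns a plain dict; default=str is a fallback for any values that are not JSON-native.

```python
import json


def model_to_json(model):
    """Serialize a Model-like object's to_dict() output to a JSON string."""
    # default=str is a fallback for values json cannot encode natively.
    return json.dumps(model.to_dict(), default=str, sort_keys=True, indent=2)
```

Any object exposing to_dict() works here, so the helper can be exercised with a stand-in object before it is pointed at a real Model.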

describe_train_test_data_split_feature_group()

Get the train and test data split for a trained model by model id. Only supported for models with custom algorithms.

Parameters:

model_id (str) – The unique ID of the model. By default, the latest model version is used if no version is specified.

Returns:

The feature group containing the training data and folds information.

Return type:

FeatureGroup

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

Model

describe()

Retrieves a full description of the specified model.

Parameters:

model_id (str) – The unique ID associated with the model.

Returns:

The description of the model.

Return type:

Model

rename(name)

Renames a model

Parameters:

name (str) – The name to apply to the model

update_python(function_source_code=None, train_function_name=None, predict_function_name=None, predict_many_function_name=None, initialize_function_name=None, training_input_tables=None, cpu_size=None, memory=None, package_requirements=None)

Updates an existing Python model using user-provided Python code. If a list of input feature groups is supplied, the materialized feature groups are passed as arguments to the train and predict functions.

This method expects functionSourceCode to be a valid language source file that contains the functions named trainFunctionName and predictFunctionName. trainFunctionName returns the ModelVersion that results from training the model, while predictFunctionName has no well-defined return type: it returns the prediction, which can be anything.

Parameters:
  • function_source_code (str) – Contents of a valid python source code file. The source code should contain the functions named trainFunctionName and predictFunctionName. A list of allowed import and system libraries for each language is specified in the user functions documentation section.

  • train_function_name (str) – Name of the function found in the source code that will be executed to train the model. It is not executed when this function is run.

  • predict_function_name (str) – Name of the function found in the source code that will be executed to run predictions through the model. It is not executed when this function is run.

  • predict_many_function_name (str) – Name of the function found in the source code that will be executed to run batch predictions through the model. It is not executed when this function is run.

  • initialize_function_name (str) – Name of the function found in the source code that initializes the trained model before it is used to make predictions.

  • training_input_tables (list) – List of feature groups that are supplied to the train function as parameters. Each parameter is a materialized DataFrame (same type as the function's return value).

  • cpu_size (str) – Size of the CPU for the model training function

  • memory (int) – Memory (in GB) for the model training function

  • package_requirements (dict) – Json with key value pairs corresponding to package: version for each dependency

Returns:

The updated model

Return type:

Model
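The following is a minimal sketch of preparing arguments for update_python(). The source string, the signatures of train() and predict(), and the helper names top_level_functions and build_update_kwargs are all hypothetical; the platform's actual contract for these functions is not stated in this reference. The sketch only checks that the named functions exist in the source before the call is made.

```python
import ast

# Hypothetical source file contents; the signatures of train() and predict()
# are placeholders, not the platform's documented contract.
FUNCTION_SOURCE_CODE = '''
def train(training_table):
    # "training_table" stands in for a materialized feature group.
    return {"row_count": len(training_table)}

def predict(model, query):
    return {"row_count": model["row_count"]}
'''


def top_level_functions(source):
    """Return the names of functions defined at module level in source."""
    tree = ast.parse(source)
    return {node.name for node in tree.body if isinstance(node, ast.FunctionDef)}


def build_update_kwargs(source, train_name, predict_name, input_tables):
    """Check the named functions exist, then assemble update_python() kwargs."""
    missing = {train_name, predict_name} - top_level_functions(source)
    if missing:
        raise ValueError(f"functions not found in source: {sorted(missing)}")
    return {
        "function_source_code": source,
        "train_function_name": train_name,
        "predict_function_name": predict_name,
        "training_input_tables": input_tables,
    }


kwargs = build_update_kwargs(
    FUNCTION_SOURCE_CODE, "train", "predict", ["training_table"]
)
# With a Model instance in hand: model.update_python(**kwargs)
```

Validating the names locally is a design choice of this sketch: a misspelled function name is caught before any request is sent.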

update_python_zip(train_function_name=None, predict_function_name=None, predict_many_function_name=None, train_module_name=None, predict_module_name=None, training_input_tables=None, cpu_size=None, memory=None, package_requirements=None)

Updates an existing Python model using a provided zip file. If a list of input feature groups is supplied, the materialized feature groups are passed as arguments to the train and predict functions.

This method expects trainModuleName and predictModuleName to be valid language source files that contain the functions named trainFunctionName and predictFunctionName, respectively. trainFunctionName returns the ModelVersion that results from training the model, while predictFunctionName has no well-defined return type: it returns the prediction, which can be anything.

Parameters:
  • train_function_name (str) – Name of the function found in train module that will be executed to train the model. It is not executed when this function is run.

  • predict_function_name (str) – Name of the function found in the predict module that will be executed to run predictions through the model. It is not executed when this function is run.

  • predict_many_function_name (str) – Name of the function found in the predict module that will be executed to run batch predictions through the model. It is not executed when this function is run.

  • train_module_name (str) – Full path of the module that contains the train function from the root of the zip.

  • predict_module_name (str) – Full path of the module that contains the predict function from the root of the zip.

  • training_input_tables (list) – List of feature groups that are supplied to the train function as parameters. Each parameter is a materialized DataFrame (same type as the function's return value).

  • cpu_size (str) – Size of the CPU for the model training function

  • memory (int) – Memory (in GB) for the model training function

  • package_requirements (dict) – Json with key value pairs corresponding to package: version for each dependency

Returns:

The updated model

Return type:

Upload

update_python_git(application_connector_id=None, branch_name=None, python_root=None, train_function_name=None, predict_function_name=None, predict_many_function_name=None, train_module_name=None, predict_module_name=None, training_input_tables=None, cpu_size=None, memory=None)

Updates an existing Python model using an existing git application connector. If a list of input feature groups is supplied, the materialized feature groups are passed as arguments to the train and predict functions.

This method expects trainModuleName and predictModuleName to be valid language source files that contain the functions named trainFunctionName and predictFunctionName, respectively. trainFunctionName returns the ModelVersion that results from training the model, while predictFunctionName has no well-defined return type: it returns the prediction, which can be anything.

Parameters:
  • application_connector_id (str) – The unique ID associated with the git application connector.

  • branch_name (str) – Name of the branch in the git repository to be used for training.

  • python_root (str) – Path from the top level of the git repository to the directory containing the Python source code. If not provided, the default is the root of the git repository.

  • train_function_name (str) – Name of the function found in train module that will be executed to train the model. It is not executed when this function is run.

  • predict_function_name (str) – Name of the function found in the predict module that will be executed to run predictions through the model. It is not executed when this function is run.

  • predict_many_function_name (str) – Name of the function found in the predict module that will be executed to run batch predictions through the model. It is not executed when this function is run.

  • train_module_name (str) – Full path of the module that contains the train function from the root of the zip.

  • predict_module_name (str) – Full path of the module that contains the predict function from the root of the zip.

  • training_input_tables (list) – List of feature groups that are supplied to the train function as parameters. Each parameter is a materialized DataFrame (same type as the function's return value).

  • cpu_size (str) – Size of the CPU for the model training function

  • memory (int) – Memory (in GB) for the model training function

Returns:

The updated model

Return type:

Model

set_training_config(training_config)

Edits the default model training config

Parameters:

training_config (dict) – The training config key/value pairs used to train this model.

Returns:

The model object after the training config is applied

Return type:

Model

set_prediction_params(prediction_config)

Sets the model prediction config for the model

Parameters:

prediction_config (dict) – The prediction config for the model

Returns:

The model object after the prediction config is applied

Return type:

Model

get_metrics(model_version=None, baseline_metrics=False)

Retrieves a full list of the metrics for the specified model.

If only the model’s unique identifier (modelId) is specified, the latest trained version of the model (modelVersion) is used.

Parameters:
  • model_version (str) – The version of the model.

  • baseline_metrics (bool) – If true, will also return the baseline model metrics for comparison.

Returns:

An object to show the model metrics and explanations for what each metric means.

Return type:

ModelMetrics

list_versions(limit=100, start_after_version=None)

Retrieves a list of the versions for a given model.

Parameters:
  • limit (int) – The maximum length of the list of model versions.

  • start_after_version (str) – The id of the version after which the list starts.

Returns:

An array of model versions.

Return type:

ModelVersion
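The limit and start_after_version parameters suggest cursor-style paging. A minimal sketch of walking every version follows; the helper name iter_all_versions is hypothetical, and the attribute that holds each version's id (defaulted here to model_version) is an assumption that this reference does not confirm.

```python
def iter_all_versions(model, page_size=100, version_id_attr="model_version"):
    """Yield every version of a model, fetching one page at a time.

    version_id_attr names the attribute holding a version's id; the default
    is an assumption, not something this reference documents.
    """
    start_after = None
    while True:
        page = model.list_versions(limit=page_size, start_after_version=start_after)
        if not page:
            return
        yield from page
        if len(page) < page_size:
            # A short page means there is nothing left to fetch.
            return
        # Continue from the last version seen on this page.
        start_after = getattr(page[-1], version_id_attr)
```

The generator stops on an empty or short page, so a model with fewer versions than page_size costs a single call.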

retrain(deployment_ids=[], feature_group_ids=None, custom_algorithm_configs=None, cpu_size=None, memory=None, training_config=None)

Retrains the specified model. Optionally, you can choose the deployments to which the retrained model is automatically deployed.

Parameters:
  • deployment_ids (list) – List of deployments to automatically deploy to.

  • feature_group_ids (list) – List of feature group ids provided by the user to train the model on.

  • custom_algorithm_configs (dict) – The user-defined training configs for each custom algorithm.

  • cpu_size (str) – Size of the cpu for the user-defined algorithms during train.

  • memory (int) – Memory (in GB) for the user-defined algorithms during train.

  • training_config (dict) – The training config key/value pairs used to train this model.

Returns:

The model that is being retrained.

Return type:

Model
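A retrain is typically followed by a wait and a status check. The sketch below chains retrain(), wait_for_training(), and get_status() from this class; the helper name retrain_and_wait is hypothetical, and the default timeout of 3600 is an arbitrary placeholder because the unit of timeout is not stated in this reference.

```python
def retrain_and_wait(model, deployment_ids=None, timeout=3600):
    """Start a retrain, block until training ends or times out, return status."""
    # Pass a fresh list so the caller's sequence is never shared or mutated.
    model.retrain(deployment_ids=list(deployment_ids or []))
    # The timeout unit is not documented here; 3600 is only a placeholder.
    model.wait_for_training(timeout=timeout)
    return model.get_status()
```

The returned string is whatever get_status() reports, so the caller decides how to treat values other than a completed training.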

delete()

Deletes the specified model and all its versions. Models which are currently used in deployments cannot be deleted.

Parameters:

model_id (str) – The ID of the model to delete.

set_default_algorithm(algorithm=None)

Sets the model’s algorithm to default for all new deployments

Parameters:
  • model_id (Unique String Identifier) – The model to set

  • algorithm (str) – The algorithm to pin in the model

wait_for_training(timeout=None)

Waits until the model is trained.

Parameters:

timeout (int, optional) – The time allowed for the call to finish. If it does not finish within the allotted time, the call times out.

wait_for_evaluation(timeout=None)

Waits until the model is completely evaluated.

Parameters:

timeout (int, optional) – The time allowed for the call to finish. If it does not finish within the allotted time, the call times out.

wait_for_full_automl(timeout=None)

Waits until the full AutoML cycle is completed.

Parameters:

timeout (int, optional) – The time allowed for the call to finish. If it does not finish within the allotted time, the call times out.

get_status(get_automl_status=False)

Gets the status of the model training.

Returns:

A string describing the status of a model training (pending, complete, etc.).

Return type:

str

Parameters:

get_automl_status (bool) –

create_refresh_policy(cron)

Creates a refresh policy for a model.

Parameters:

cron (str) – A cron-style string to set the refresh time.

Returns:

The refresh policy object.

Return type:

RefreshPolicy
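As a sketch of how a cron string might be checked before it is passed to create_refresh_policy(): the helper name create_checked_refresh_policy is hypothetical, and the five-field layout (minute, hour, day of month, month, day of week) is an assumption about the accepted format that this reference does not spell out.

```python
def create_checked_refresh_policy(model, cron):
    """Check that cron has five fields, then create the refresh policy."""
    fields = cron.split()
    # Assumption: the classic five-field cron layout is what is expected.
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}: {cron!r}")
    # Re-join to collapse any repeated whitespace between fields.
    return model.create_refresh_policy(" ".join(fields))
```

For example, under the five-field assumption, "0 4 * * 1" would request a refresh at 04:00 every Monday.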

list_refresh_policies()

Gets the refresh policies in a list.

Returns:

A list of refresh policy objects.

Return type:

List[RefreshPolicy]

get_train_test_feature_group_as_pandas()

Gets the model's train/test data split feature group as a pandas DataFrame.

Returns:

A pandas dataframe for the training data with fold column.

Return type:

pandas.DataFrame
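Once the split is available as a DataFrame, a quick sanity check is to count rows per fold. The sketch below is one way to do that; the helper name rows_per_fold is hypothetical, and the name of the fold column must be supplied by the caller because this reference does not state it.

```python
from collections import Counter


def rows_per_fold(frame, fold_column):
    """Count rows for each distinct value in the fold column."""
    # Iterating a pandas Series yields its values, so Counter can tally them.
    return dict(Counter(frame[fold_column]))
```

It could be called on the result of get_train_test_feature_group_as_pandas() with the actual fold column name substituted for fold_column.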