abacusai.feature_group
Module Contents
Classes
- class abacusai.feature_group.FeatureGroup(client, modificationLock=None, featureGroupId=None, name=None, featureGroupSourceType=None, tableName=None, sql=None, datasetId=None, functionSourceCode=None, functionName=None, sourceTables=None, createdAt=None, description=None, featureGroupType=None, sqlError=None, latestVersionOutdated=None, referencedFeatureGroups=None, tags=None, primaryKey=None, updateTimestampKey=None, lookupKeys=None, streamingEnabled=None, featureGroupUse=None, incremental=None, mergeConfig=None, transformConfig=None, samplingConfig=None, cpuSize=None, memory=None, streamingReady=None, featureTags=None, moduleName=None, templateBindings=None, featureExpression=None, useOriginalCsvNames=None, pythonFunctionBindings=None, pythonFunctionName=None, annotationConfig=None, features={}, duplicateFeatures={}, pointInTimeGroups={}, concatenationConfig={}, indexingConfig={}, codeSource={}, featureGroupTemplate={}, latestFeatureGroupVersion={})
Bases:
abacusai.return_class.AbstractApiClass
- Parameters:
client (ApiClient) – An authenticated API Client instance
modificationLock (bool) –
featureGroupId (str) –
name (str) –
featureGroupSourceType (str) –
tableName (str) –
sql (str) –
datasetId (str) –
functionSourceCode (str) –
functionName (str) –
sourceTables (list) –
createdAt (str) –
description (str) –
featureGroupType (str) –
sqlError (str) –
latestVersionOutdated (bool) –
referencedFeatureGroups (list) –
tags (list) –
primaryKey (str) –
updateTimestampKey (str) –
lookupKeys (list) –
streamingEnabled (bool) –
featureGroupUse (str) –
incremental (bool) –
mergeConfig (dict) –
transformConfig (dict) –
samplingConfig (dict) –
cpuSize (str) –
memory (number(integer)) –
streamingReady (bool) –
featureTags (dict) –
moduleName (str) –
templateBindings (list) –
featureExpression (str) –
useOriginalCsvNames (bool) –
pythonFunctionBindings (list) –
pythonFunctionName (str) –
annotationConfig (dict) –
features (Feature) –
duplicateFeatures (Feature) –
pointInTimeGroups (PointInTimeGroup) –
latestFeatureGroupVersion (FeatureGroupVersion) –
concatenationConfig (ConcatenationConfig) –
indexingConfig (IndexingConfig) –
codeSource (CodeSource) –
featureGroupTemplate (FeatureGroupTemplate) –
- __repr__()
Return repr(self).
- to_dict()
Get a dict representation of the parameters in this class
- Returns:
The dict value representation of the class parameters
- Return type:
- add_to_project(project_id, feature_group_type='CUSTOM_TABLE', feature_group_use=None)
Adds a feature group to a project.
- Parameters:
project_id (str) – The unique ID associated with the project.
feature_group_type (str) – The feature group type of the feature group. The type is based on the use case under which the feature group is being created. For example, Catalog Attributes can be a feature group type under personalized recommendation use case.
feature_group_use (str) – The user-assigned feature group use, which allows organizing a project's feature groups: DATA_WRANGLING, TRAINING_INPUT, BATCH_PREDICTION_INPUT.
- remove_from_project(project_id)
Removes a feature group from a project.
- Parameters:
project_id (str) – The unique ID associated with the project.
- set_type(project_id, feature_group_type='CUSTOM_TABLE')
Update the feature group type in a project. The feature group must already be added to the project.
- Parameters:
project_id (str) – The unique ID associated with the project.
feature_group_type (str) – The feature group type to set the feature group as. The type is based on the use case under which the feature group is being created. For example, Catalog Attributes can be a feature group type under personalized recommendation use case.
- use_for_training(project_id, use_for_training=True)
Use the feature group for model training input
- describe_annotation(feature_name=None, doc_id=None, feature_group_row_identifier=None)
Get the latest annotation entry for a given feature group, feature, and document.
- Parameters:
feature_name (str) – The name of the feature the annotation is on.
doc_id (str) – The ID of the primary document the annotation is on.
feature_group_row_identifier (str) – The key value of the feature group row the annotation is on (cast to string). Usually the primary key value. At least one of the doc_id or key value must be provided so that the correct annotation can be identified.
- Returns:
The latest annotation entry for the given feature group, feature, and document and/or annotation key value
- Return type:
- create_sampling(table_name, sampling_config, description=None)
Creates a new feature group defined as a sample of rows from another feature group.
For efficiency, sampling is approximate unless otherwise specified. (E.g. the number of rows may vary slightly from what was requested).
- Parameters:
- Returns:
The created feature group.
- Return type:
- set_sampling_config(sampling_config)
Set a FeatureGroup’s sampling to the config values provided, so that the rows the FeatureGroup returns will be a sample of those it would otherwise have returned.
Currently, sampling is only for Sampling FeatureGroups, so this API only allows calling on that kind of FeatureGroup.
- Parameters:
sampling_config (dict) – A json object string specifying the sampling method and parameters specific to that sampling method. Empty sampling_config means no sampling.
- Returns:
The updated feature group.
- Return type:
- set_merge_config(merge_config)
Set a MergeFeatureGroup’s merge config to the values provided, so that the feature group only returns a bounded range of an incremental dataset.
- Parameters:
merge_config (dict) – A json object string specifying the merge rule. An empty mergeConfig will default to only including the latest Dataset Version.
- set_transform_config(transform_config)
Set a TransformFeatureGroup’s transform config to the values provided.
- Parameters:
transform_config (dict) – A json object string specifying the pre-defined transformation.
- set_schema(schema)
Creates a new schema and points the feature group to the new feature group schema id.
- Parameters:
schema (list) – An array of json objects with ‘name’ and ‘dataType’ properties.
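As an illustration of the shape described for `schema`, the sketch below builds a list of objects with 'name' and 'dataType' properties. `build_schema` is a hypothetical helper, and the data type strings are placeholders rather than a verified list of accepted values.

```python
def build_schema(columns):
    """Turn (name, data_type) pairs into a list of {'name', 'dataType'} dicts."""
    schema = []
    seen = set()
    for name, data_type in columns:
        if name in seen:
            raise ValueError(f"duplicate feature name: {name}")
        seen.add(name)
        schema.append({"name": name, "dataType": data_type})
    return schema


# Placeholder data types; check the accepted values before using them.
schema = build_schema([("user_id", "STRING"), ("age", "INTEGER")])
print(schema)
# feature_group.set_schema(schema)  # would apply it to a real FeatureGroup
```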
- get_schema(project_id=None)
Returns a schema given a specific FeatureGroup in a project.
- create_feature(name, select_expression)
Creates a new feature in a Feature Group from a SQL select statement
- Parameters:
- Returns:
A feature group object with the newly added feature.
- Return type:
- add_tag(tag)
Adds a tag to the feature group
- Parameters:
tag (str) – The tag to add to the feature group
- remove_tag(tag)
Removes a tag from the feature group
- Parameters:
tag (str) – The tag to remove from the feature group
- add_annotatable_feature(name, annotation_type)
- Parameters:
- Returns:
None
- Return type:
- set_feature_as_annotatable_feature(feature_name, annotation_type, feature_group_row_identifier_feature=None, doc_id_feature=None)
- Parameters:
- Returns:
None
- Return type:
- unset_feature_as_annotatable_feature(feature_name)
- Parameters:
feature_name (str) –
- Returns:
None
- Return type:
- add_annotation_label(label_name, annotation_type, label_definition=None)
- Parameters:
- Returns:
None
- Return type:
- create_nested_feature(nested_feature_name, table_name, using_clause, where_clause=None, order_clause=None)
Creates a new nested feature in a feature group from SQL statements.
- Parameters:
nested_feature_name (str) – The name of the feature.
table_name (str) – The table name of the feature group to nest
using_clause (str) – The SQL join column or logic to join the nested table with the parent
where_clause (str) – A SQL where statement to filter the nested rows
order_clause (str) – A SQL clause to order the nested rows
- Returns:
A feature group object with the newly added nested feature.
- Return type:
- update_nested_feature(nested_feature_name, table_name=None, using_clause=None, where_clause=None, order_clause=None, new_nested_feature_name=None)
Updates a previously existing nested feature in a feature group.
- Parameters:
nested_feature_name (str) – The name of the feature to be updated.
table_name (str) – The name of the table.
using_clause (str) – The SQL join column or logic to join the nested table with the parent
where_clause (str) – A SQL where statement to filter the nested rows
order_clause (str) – A SQL clause to order the nested rows
new_nested_feature_name (str) – New name for the nested feature.
- Returns:
A feature group object with the updated nested feature.
- Return type:
- delete_nested_feature(nested_feature_name)
Delete a nested feature.
- Parameters:
nested_feature_name (str) – The name of the feature to be deleted.
- Returns:
A feature group object without the deleted nested feature.
- Return type:
- create_point_in_time_feature(feature_name, history_table_name, aggregation_keys, timestamp_key, historical_timestamp_key, expression, lookback_window_seconds=None, lookback_window_lag_seconds=0, lookback_count=None, lookback_until_position=0)
Creates a new point in time feature in a feature group using another historical feature group, window spec and aggregate expression.
We use the aggregation keys and either the lookbackWindowSeconds or the lookbackCount value to perform the window aggregation for every row in the current feature group. If the window is specified in seconds, then all rows in the history table that match the aggregation keys and have historicalTimeFeature >= lookbackStartCount and < the value of the current row’s timeFeature are considered. An optional lookbackWindowLagSeconds (positive or negative) can be used to offset the current value of the timeFeature. If this value is negative, we will look at future rows in the history table, so care must be taken to make sure that these rows are available in the online context when performing a lookup on this feature group.
If the window is specified in counts, then we order the historical table rows by time and consider rows from the window where the rank order is >= lookbackCount; the window includes the row just prior to the current one. The lag is specified in terms of positions using lookbackUntilPosition.
- Parameters:
feature_name (str) – The name of the feature to create
history_table_name (str) – The table name of the history table.
aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation.
timestamp_key (str) – Name of feature which contains the timestamp value for the point in time feature
historical_timestamp_key (str) – Name of feature which contains the historical timestamp.
expression (str) – SQL Aggregate expression which can convert a sequence of rows into a scalar value.
lookback_window_seconds (float) – If window is specified in terms of time, number of seconds in the past from the current time for start of the window.
lookback_window_lag_seconds (float) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window. If it is negative, we are looking at the “future” rows in the history table.
lookback_count (int) – If window is specified in terms of count, the start position of the window (0 is the current row)
lookback_until_position (int) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window by that many rows. If it is negative, we are looking at those many “future” rows in the history table.
- Returns:
A feature group object with the newly added point in time feature.
- Return type:
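The time-based window described above can be sketched in plain Python. This is one reading of the semantics, not the service's implementation: it assumes the lag shifts the whole window, so the window is [t - lag - lookback, t - lag) for a current-row timestamp t. The function name and sample rows are hypothetical.

```python
def time_window_aggregate(current_rows, history_rows, aggregation_keys,
                          timestamp_key, historical_timestamp_key,
                          aggregate, lookback_window_seconds,
                          lookback_window_lag_seconds=0):
    """Aggregate, per current row, the history rows inside its lookback window."""
    results = []
    for row in current_rows:
        # Assumption: a positive lag moves the window end into the past,
        # a negative lag moves it into the future.
        window_end = row[timestamp_key] - lookback_window_lag_seconds
        window_start = window_end - lookback_window_seconds
        matched = [
            h for h in history_rows
            if all(h[k] == row[k] for k in aggregation_keys)
            and window_start <= h[historical_timestamp_key] < window_end
        ]
        results.append(aggregate(matched))
    return results


history = [
    {"user": "a", "ts": 100, "amount": 5.0},
    {"user": "a", "ts": 150, "amount": 7.0},
    {"user": "a", "ts": 190, "amount": 1.0},
    {"user": "a", "ts": 200, "amount": 2.0},
    {"user": "b", "ts": 120, "amount": 9.0},
]
current = [{"user": "a", "ts": 200}]


def total(rows):
    return sum(r["amount"] for r in rows)


# Window [140, 200) for user "a": rows at ts=150 and ts=190.
print(time_window_aggregate(current, history, ["user"], "ts", "ts", total, 60))
# With a lag of 20 seconds the window becomes [120, 180): only ts=150.
print(time_window_aggregate(current, history, ["user"], "ts", "ts", total, 60, 20))
```

Note that the row at ts=200 is excluded in the first call because the upper bound is strict, matching the "< the value of the current row's timeFeature" condition.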
- update_point_in_time_feature(feature_name, history_table_name=None, aggregation_keys=None, timestamp_key=None, historical_timestamp_key=None, expression=None, lookback_window_seconds=None, lookback_window_lag_seconds=None, lookback_count=None, lookback_until_position=None, new_feature_name=None)
Updates an existing point in time feature in a feature group. See createPointInTimeFeature for detailed semantics.
- Parameters:
feature_name (str) – The name of the feature.
history_table_name (str) – The table name of the history table. If not specified, we use the current table to do a self join.
aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation.
timestamp_key (str) – Name of feature which contains the timestamp value for the point in time feature
historical_timestamp_key (str) – Name of feature which contains the historical timestamp.
expression (str) – SQL Aggregate expression which can convert a sequence of rows into a scalar value.
lookback_window_seconds (float) – If window is specified in terms of time, number of seconds in the past from the current time for start of the window.
lookback_window_lag_seconds (float) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window. If it is negative, we are looking at the “future” rows in the history table.
lookback_count (int) – If window is specified in terms of count, the start position of the window (0 is the current row)
lookback_until_position (int) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window by that many rows. If it is negative, we are looking at those many “future” rows in the history table.
new_feature_name (str) – New name for the point in time feature.
- Returns:
A feature group object with the updated point in time feature.
- Return type:
- create_point_in_time_group(group_name, window_key, aggregation_keys, history_table_name=None, history_window_key=None, history_aggregation_keys=None, lookback_window=None, lookback_window_lag=0, lookback_count=None, lookback_until_position=0)
Create point in time group
- Parameters:
group_name (str) – The name of the point in time group
window_key (str) – Name of feature to use for ordering the rows on the source table
aggregation_keys (list) – List of keys to use on the source table for the window aggregation.
history_table_name (str) – The table to use for aggregating, if not provided, the source table will be used
history_window_key (str) – Name of feature to use for ordering the rows on the history table. If not provided, the windowKey from the source table will be used
history_aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation. If not provided, the aggregationKeys from the source table will be used. Must be the same length and order as the source table’s aggregationKeys
lookback_window (float) – Number of seconds in the past from the current time for start of the window. If 0, the lookback will include all rows.
lookback_window_lag (float) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window. If it is negative, we are looking at the “future” rows in the history table.
lookback_count (int) – If window is specified in terms of count, the start position of the window (0 is the current row)
lookback_until_position (int) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window by that many rows. If it is negative, we are looking at those many “future” rows in the history table.
- Returns:
The feature group after the point in time group has been created
- Return type:
- update_point_in_time_group(group_name, window_key=None, aggregation_keys=None, history_table_name=None, history_window_key=None, history_aggregation_keys=None, lookback_window=None, lookback_window_lag=None, lookback_count=None, lookback_until_position=None)
Update point in time group
- Parameters:
group_name (str) – The name of the point in time group
window_key (str) – Name of feature which contains the timestamp value for the point in time feature
aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation.
history_table_name (str) – The table to use for aggregating, if not provided, the source table will be used
history_window_key (str) – Name of feature to use for ordering the rows on the history table. If not provided, the windowKey from the source table will be used
history_aggregation_keys (list) – List of keys to use for joining the historical table and performing the window aggregation. If not provided, the aggregationKeys from the source table will be used. Must be the same length and order as the source table’s aggregationKeys
lookback_window (float) – Number of seconds in the past from the current time for start of the window.
lookback_window_lag (float) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window. If it is negative, we are looking at the “future” rows in the history table.
lookback_count (int) – If window is specified in terms of count, the start position of the window (0 is the current row)
lookback_until_position (int) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window by that many rows. If it is negative, we are looking at those many “future” rows in the history table.
- Returns:
The feature group after the update has been applied
- Return type:
- delete_point_in_time_group(group_name)
Delete point in time group
- Parameters:
group_name (str) – The name of the point in time group
- Returns:
The feature group after the point in time group has been deleted
- Return type:
- create_point_in_time_group_feature(group_name, name, expression)
Create point in time group feature
- Parameters:
- Returns:
The feature group after the update has been applied
- Return type:
- update_point_in_time_group_feature(group_name, name, expression)
Update a feature’s SQL expression in a point in time group
- Parameters:
- Returns:
The feature group after the update has been applied
- Return type:
- set_feature_type(feature, feature_type)
Set a feature’s type in a feature group. Specify the feature group ID, feature name and feature type, and the method will return the new column with the resulting changes reflected.
- Parameters:
feature (str) – The name of the feature.
feature_type (str) – The machine learning type of the data in the feature. CATEGORICAL, CATEGORICAL_LIST, NUMERICAL, TIMESTAMP, TEXT, EMAIL, LABEL_LIST, JSON, OBJECT_REFERENCE, MULTICATEGORICAL_LIST, COORDINATE_LIST, NUMERICAL_LIST, TIMESTAMP_LIST. Refer to the guide on feature types (https://api.abacus.ai/app/help/class/FeatureType) for more information. Note: Some FeatureMappings will restrict the options or explicitly set the FeatureType.
- Returns:
The feature group after the data_type is applied
- Return type:
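Because `feature_type` is passed as a string, a client-side check against the values listed above can catch typos before the call is made. This is a minimal sketch; `normalize_feature_type` is a hypothetical helper, and the set only mirrors the list in this section.

```python
# Feature types as listed in the set_feature_type parameter description.
FEATURE_TYPES = frozenset({
    "CATEGORICAL", "CATEGORICAL_LIST", "NUMERICAL", "TIMESTAMP", "TEXT",
    "EMAIL", "LABEL_LIST", "JSON", "OBJECT_REFERENCE",
    "MULTICATEGORICAL_LIST", "COORDINATE_LIST", "NUMERICAL_LIST",
    "TIMESTAMP_LIST",
})


def normalize_feature_type(feature_type):
    """Upper-case and validate a feature type string, raising on unknown values."""
    candidate = feature_type.strip().upper()
    if candidate not in FEATURE_TYPES:
        raise ValueError(f"unknown feature type: {feature_type!r}")
    return candidate


print(normalize_feature_type(" numerical "))
# feature_group.set_feature_type("age", normalize_feature_type("numerical"))
```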
- invalidate_streaming_data(invalid_before_timestamp)
Invalidates all streaming data with timestamp before invalidBeforeTimestamp
- Parameters:
invalid_before_timestamp (int) – The Unix timestamp; any data with a timestamp before this time will be deleted.
- concatenate_data(source_feature_group_id, merge_type='UNION', replace_until_timestamp=None, skip_materialize=False)
Concatenates data from one feature group to another. Feature groups can be merged if their schemas are compatible and they have the special updateTimestampKey column and, if set, the primaryKey column. The second operand in the concatenate operation will be appended to the first operand (merge target).
- Parameters:
source_feature_group_id (str) – The feature group to concatenate with the destination feature group.
merge_type (str) – UNION or INTERSECTION
replace_until_timestamp (int) – The unix timestamp to specify the point till which we will replace data from the source feature group.
skip_materialize (bool) – If true, will not materialize the concatenated feature group
- remove_concatenation_config()
Removes the concatenation config on a destination feature group.
- Parameters:
feature_group_id (str) – Removes the concatenation configuration on a destination feature group
- refresh()
Calls describe and refreshes the current object’s fields
- Returns:
The current object
- Return type:
- describe()
Describe a Feature Group.
- Parameters:
feature_group_id (str) – The unique ID associated with the feature group.
- Returns:
The feature group object.
- Return type:
- set_indexing_config(primary_key=None, update_timestamp_key=None, lookup_keys=None)
Sets various attributes of the feature group used for deployment lookups and streaming updates.
- Parameters:
primary_key (str) – Name of feature which defines the primary key of the feature group.
update_timestamp_key (str) – Name of feature which defines the update timestamp of the feature group - used in concatenation and primary key deduplication.
lookup_keys (list) – List of feature names which can be used in the lookup api to restrict the computation to a set of dataset rows. These feature names have to correspond to underlying dataset columns.
- update(description=None)
Modifies an existing feature group
- Parameters:
description (str) – The description about the feature group.
- Returns:
The updated feature group object.
- Return type:
- detach_from_template()
Update a feature group to detach it from a template.
Currently, this converts the feature group into a SQL feature group rather than a template feature group.
- Parameters:
feature_group_id (str) – The unique ID associated with the feature group.
- Returns:
The updated feature group
- Return type:
- update_template_bindings(template_bindings=None)
Update the feature group template bindings for a template feature group.
- Parameters:
template_bindings (list) – Values in these bindings override values set in the template.
- Returns:
The updated feature group
- Return type:
- update_python_function_bindings(python_function_bindings)
Updates an existing Feature Group’s python function bindings from a user provided Python Function. If a list of feature groups is supplied within the python function bindings, we will provide as arguments to the function DataFrames (pandas in the case of Python) with the materialized feature groups for those input feature groups.
- Parameters:
python_function_bindings (list) – List of arguments to be supplied to the function as parameters in the format [{‘name’: ‘function_argument’, ‘variable_type’: ‘FEATURE_GROUP’, ‘value’: ‘name_of_feature_group’}].
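The binding format quoted above can be generated from a simple mapping. A minimal sketch, assuming every argument is bound to a feature group; `feature_group_bindings` and the argument and table names are hypothetical.

```python
def feature_group_bindings(argument_to_table):
    """Build bindings that map function arguments to feature group names."""
    return [
        {"name": argument, "variable_type": "FEATURE_GROUP", "value": table_name}
        for argument, table_name in argument_to_table.items()
    ]


bindings = feature_group_bindings({"orders": "orders_fg", "users": "users_fg"})
print(bindings)
# feature_group.update_python_function_bindings(bindings)
```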
- update_sql_definition(sql)
Updates the SQL statement for a feature group.
- Parameters:
sql (str) – Input SQL statement for the feature group.
- Returns:
The updated feature group
- Return type:
- update_dataset_feature_expression(feature_expression)
Updates the SQL feature expression for a dataset feature group’s custom features
- Parameters:
feature_expression (str) – Input SQL statement for the feature group.
- Returns:
The updated feature group
- Return type:
- update_function_definition(function_source_code=None, function_name=None, input_feature_groups=None, cpu_size=None, memory=None, package_requirements=None, use_original_csv_names=False, python_function_bindings=None)
Updates the function definition for a feature group created using createFeatureGroupFromFunction
- Parameters:
function_source_code (str) – Contents of a valid source code file in a supported Feature Group specification language (currently only Python). The source code should contain a function called function_name. A list of allowed import and system libraries for each language is specified in the user functions documentation section.
function_name (str) – Name of the function found in the source code that will be executed (on the optional inputs) to materialize this feature group.
input_feature_groups (list) – List of feature groups that are supplied to the function as parameters. Each of the parameters is a materialized DataFrame (same type as the function’s return value).
cpu_size (str) – Size of the cpu for the feature group function
memory (int) – Memory (in GB) for the feature group function
package_requirements (dict) – Json with key value pairs corresponding to package: version for each dependency
use_original_csv_names (bool) – If set to true, feature group uses the original column names for input feature groups from csv datasets.
python_function_bindings (list) – List of arguments to be supplied to the function as parameters in the format [{‘name’: ‘function_argument’, ‘variable_type’: ‘FEATURE_GROUP’, ‘value’: ‘name_of_feature_group’}].
- Returns:
The updated feature group
- Return type:
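Since `function_source_code` must contain a function called `function_name`, a quick local check can catch a mismatch before the update is sent. The sketch below uses the standard `ast` module; `defines_function`, the sample source, and its column names are hypothetical.

```python
import ast


def defines_function(source_code, function_name):
    """Return True if source_code defines a top-level function with that name."""
    tree = ast.parse(source_code)
    return any(
        isinstance(node, ast.FunctionDef) and node.name == function_name
        for node in tree.body
    )


# Hypothetical feature group function: takes a DataFrame, returns a DataFrame.
SOURCE = '''
def add_total(orders):
    orders = orders.copy()
    orders["total"] = orders["price"] * orders["quantity"]
    return orders
'''

print(defines_function(SOURCE, "add_total"))
# feature_group.update_function_definition(
#     function_source_code=SOURCE, function_name="add_total")
```

The check only parses the source; it does not import or execute it, so it runs without pandas installed.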
- update_zip(function_name, module_name, input_feature_groups=None, cpu_size=None, memory=None, package_requirements=None)
Updates the zip for a feature group created using createFeatureGroupFromZip
- Parameters:
function_name (str) – Name of the function found in the source code that will be executed (on the optional inputs) to materialize this feature group.
module_name (str) – Path to the file with the feature group function.
input_feature_groups (list) – List of feature groups that are supplied to the function as parameters. Each of the parameters is a materialized DataFrame (same type as the function’s return value).
cpu_size (str) – Size of the cpu for the feature group function
memory (int) – Memory (in GB) for the feature group function
package_requirements (dict) – Json with key value pairs corresponding to package: version for each dependency
- Returns:
The Upload to upload the zip file to
- Return type:
- update_git(application_connector_id=None, branch_name=None, python_root=None, function_name=None, module_name=None, input_feature_groups=None, cpu_size=None, memory=None, package_requirements=None)
Updates a feature group created using createFeatureGroupFromGit
- Parameters:
application_connector_id (str) – The unique ID associated with the git application connector.
branch_name (str) – Name of the branch in the git repository to be used for training.
python_root (str) – Path from the top level of the git repository to the directory containing the Python source code. If not provided, the default is the root of the git repository.
function_name (str) – Name of the function found in the source code that will be executed (on the optional inputs) to materialize this feature group.
module_name (str) – Path to the file with the feature group function.
input_feature_groups (list) – List of feature groups that are supplied to the function as parameters. Each of the parameters is a materialized DataFrame (same type as the function’s return value).
cpu_size (str) – Size of the cpu for the feature group function
memory (int) – Memory (in GB) for the feature group function
package_requirements (dict) – Json with key value pairs corresponding to package: version for each dependency
- Returns:
The updated FeatureGroup
- Return type:
- update_feature(name, select_expression=None, new_name=None)
Modifies an existing feature in a feature group. A user needs to specify the name and feature group ID and either a SQL statement or new name to update the feature.
- Parameters:
- Returns:
The updated feature group object.
- Return type:
- list_exports()
Lists all of the feature group exports for a given feature group
- Parameters:
feature_group_id (str) – The ID of the feature group
- Returns:
The feature group exports
- Return type:
- set_modifier_lock(locked=True)
Locks a feature group to prevent it from being modified.
- Parameters:
locked (bool) – True or False to disable or enable feature group modification.
- list_modifiers()
Lists the users who can modify a feature group.
- Parameters:
feature_group_id (str) – The unique ID associated with the feature group.
- Returns:
Modification lock status and groups and organizations added to the feature group.
- Return type:
- add_user_to_modifiers(email)
Adds user to a feature group.
- Parameters:
email (str) – The email address of the user to be added.
- add_organization_group_to_modifiers(organization_group_id)
Adds an organization group to a feature group.
- Parameters:
organization_group_id (str) – The unique ID associated with the organization group.
- remove_user_from_modifiers(email)
Removes user from a feature group.
- Parameters:
email (str) – The email address of the user to be removed.
- remove_organization_group_from_modifiers(organization_group_id)
Removes an organization group from a feature group.
- Parameters:
organization_group_id (str) – The unique ID associated with the organization group.
- delete_feature(name)
Removes an existing feature from a feature group. A user needs to specify the name of the feature to be deleted and the feature group ID.
- Parameters:
name (str) – The name of the feature to be deleted.
- Returns:
The updated feature group object.
- Return type:
- delete()
Removes an existing feature group.
- Parameters:
feature_group_id (str) – The unique ID associated with the feature group.
- create_version(variable_bindings=None)
Creates a snapshot for a specified feature group.
- Parameters:
variable_bindings (dict) – JSON object (aka map) defining variable bindings that override parent feature group values.
- Returns:
A feature group version.
- Return type:
- list_versions(limit=100, start_after_version=None)
Retrieves a list of all feature group versions for the specified feature group.
- Parameters:
- Returns:
An array of feature group versions.
- Return type:
- create_template(name, template_sql, template_variables, description=None, template_bindings=None, should_attach_feature_group_to_template=False)
Create a feature group template.
- Parameters:
name (str) – The user-friendly name for this feature group template.
template_sql (str) – The template sql that will be resolved by applying values from the template variables to generate sql for a feature group.
template_variables (list) – The template variables for resolving the template.
description (str) – A description of this feature group template
template_bindings (list) – If the feature group will be attached to the newly created template, set these variable bindings on that feature group.
should_attach_feature_group_to_template (bool) – Set to True to convert the feature group to a template feature group and attach it to the newly created template.
- Returns:
The created feature group template
- Return type:
- suggest_template_for()
Suggest values for a feature group template, based on a feature group.
- Parameters:
feature_group_id (str) – The unique ID associated with the feature group to use for suggesting values to use for the template.
- Returns:
None
- Return type:
- get_recent_streamed_data()
Returns recently streamed data to a streaming feature group.
- Parameters:
feature_group_id (str) – The unique ID associated with the feature group.
- create_prediction_metric(prediction_metric_config, project_id=None)
Create a prediction metric job description for the given prediction and actual-labels data.
- Parameters:
- Returns:
The Prediction Metric job description.
- Return type:
- list_prediction_metrics(limit=100, should_include_latest_version_description=True, start_after_id=None)
List the prediction metrics for a feature group.
- Parameters:
limit (int) – The number of prediction metrics to be retrieved.
should_include_latest_version_description (bool) – include the description of the latest prediction metric version for each prediction metric
start_after_id (str) – An offset parameter to exclude all prediction metrics till the specified prediction metric ID.
- Returns:
The prediction metrics for this feature group.
- Return type:
- query_prediction_metrics(project_id=None, limit=100, should_include_latest_version_description=True, start_after_id=None)
Query and return prediction metrics and extra data needed by the UI, constrained by the parameters provided.
- Parameters:
feature_group_id (str) – [optional] The feature group used as input to the prediction metrics.
project_id (str) – [optional] The project_id of the prediction metrics.
limit (int) – The number of prediction metrics to be retrieved.
should_include_latest_version_description (bool) – Include the description of the latest prediction metric version for each prediction metric.
start_after_id (str) – An offset parameter to exclude all prediction metrics till the specified prediction metric ID.
- Returns:
The prediction metrics for this feature group.
- Return type:
- upsert_data(streaming_token, data)
Updates data in the feature group for a given lookup key recordId if the recordId is found; otherwise inserts new data into the feature group.
- append_data(streaming_token, data)
Appends new data into the feature group for a given lookup key recordId.
- upsert_multiple_data(streaming_token, data)
Updates data in the feature group for a given lookup key recordId if the recordId is found; otherwise inserts new data into the feature group.
- append_multiple_data(streaming_token, data)
Appends new data into the feature group for a given lookup key recordId.
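The difference between appending and upserting can be shown on an in-memory list of rows. This is only a conceptual sketch: it assumes an upsert replaces the whole stored row for a matching key, which this section does not state, and it does not model streaming tokens. The function names and the `record_id` key are hypothetical.

```python
def append_row(rows, row):
    """Always add the row, even if its key already exists."""
    rows.append(dict(row))


def upsert_row(rows, row, key):
    """Replace the row with a matching key, or insert it if none matches."""
    for index, existing in enumerate(rows):
        if existing[key] == row[key]:
            rows[index] = dict(row)  # assumption: the whole row is replaced
            return
    rows.append(dict(row))


table = [{"record_id": 1, "score": 0.2}]
upsert_row(table, {"record_id": 1, "score": 0.9}, key="record_id")  # update
upsert_row(table, {"record_id": 2, "score": 0.5}, key="record_id")  # insert
append_row(table, {"record_id": 2, "score": 0.7})                   # duplicate key kept
print(table)
```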
- wait_for_dataset(timeout=7200)
A waiting call until the feature group’s dataset, if any, is ready for use.
- Parameters:
timeout (int, optional) – The time, in seconds, allowed for the call to finish. If it does not finish within that time, the call times out. Defaults to 7200 seconds.
- wait_for_upload(timeout=7200)
Waits for a feature group created from a dataframe to be ready for materialization and version creation.
- Parameters:
timeout (int, optional) – The time, in seconds, allowed for the call to finish. If it does not finish within that time, the call times out. Defaults to 7200 seconds.
- wait_for_materialization(timeout=7200)
A waiting call until feature group is materialized.
- Parameters:
timeout (int, optional) – The time, in seconds, allowed for the call to finish. If it does not finish within that time, the call times out. Defaults to 7200 seconds.
- wait_for_streaming_ready(timeout=600)
Waits for the feature group indexing config to be applied for streaming
- Parameters:
timeout (int, optional) – The time, in seconds, allowed for the call to finish. If it does not finish within that time, the call times out. Defaults to 600 seconds.
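The `wait_for_*` calls above all follow a wait-with-timeout pattern. How the library implements them is not shown in this section; the sketch below is a generic polling loop under that caveat. `wait_until`, its `poll_interval`, and the injectable `clock` and `sleep` arguments are hypothetical.

```python
import time


def wait_until(is_ready, timeout=7200, poll_interval=5.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll is_ready() until it returns True or timeout seconds have elapsed."""
    deadline = clock() + timeout
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            raise TimeoutError(f"not ready after {timeout} seconds")
        sleep(poll_interval)


# Simulated readiness check that succeeds on the third poll; sleeping is
# replaced with a no-op so the example finishes immediately.
answers = iter([False, False, True])
print(wait_until(lambda: next(answers), timeout=10, sleep=lambda seconds: None))
```

In real use, `is_ready` could wrap a check on `get_status()`, though the status values to test for are not listed here.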
- get_status(streaming_status=False)
Gets the status of the feature group.
- load_as_pandas()
Loads the feature groups into a python pandas dataframe.
- Returns:
A pandas dataframe with annotations and text_snippet columns.
- Return type:
DataFrame
- describe_dataset()
Displays the dataset attached to a feature group.
- Returns:
A dataset object with all the relevant information about the dataset.
- Return type:
- materialize()
Materializes the feature group’s latest change as of the API call time. Materialization is skipped if nothing has changed since the current latest version.
- Returns:
A feature group object with the latest changes materialized.
- Return type: