kedro.contrib.io.azure package
AbstractDataSet implementation for reading/writing data to Azure Blob Storage.

Submodules
kedro.contrib.io.azure.csv_blob module
AbstractDataSet implementation to access CSV files directly from Microsoft’s Azure Blob Storage.
class kedro.contrib.io.azure.csv_blob.CSVBlobDataSet(filepath, container_name, credentials, blob_to_text_args=None, blob_from_text_args=None, load_args=None, save_args=None)
Bases: kedro.io.core.AbstractDataSet

CSVBlobDataSet loads and saves CSV files in Microsoft’s Azure Blob Storage. It uses the Azure Storage SDK to read from and write to Azure, and pandas to handle the CSV file locally.

Example:
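import pandas as pd
from kedro.contrib.io.azure.csv_blob import CSVBlobDataSet

data = pd.DataFrame({'col1': [1, 2], 'col2': [4, 5], 'col3': [5, 6]})

# Placeholder credentials: supply account_name with account_key or sas_token.
credentials = {"account_name": "<account_name>", "account_key": "<account_key>"}
data_set = CSVBlobDataSet(filepath="test.csv", container_name="test_container", credentials=credentials, load_args=None, save_args={"index": False})
data_set.save(data)
reloaded = data_set.load()

assert data.equals(reloaded)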
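The constructor requires a credentials dictionary. The following is a minimal sketch of one way to assemble it without hard-coding secrets. The helper name `build_credentials` and the environment variable names are hypothetical, and pairing `sas_token` with `account_name` is an assumption based on the parameter description rather than confirmed behaviour.

```python
import os


def build_credentials(env=None):
    """Hypothetical helper: build a credentials dict for CSVBlobDataSet.

    The environment variable names below are illustrative assumptions,
    not names that kedro or the Azure SDK read themselves.
    """
    env = os.environ if env is None else env
    creds = {"account_name": env.get("AZURE_STORAGE_ACCOUNT", "<account_name>")}
    if "AZURE_STORAGE_SAS_TOKEN" in env:
        # Prefer a SAS token when one is provided.
        creds["sas_token"] = env["AZURE_STORAGE_SAS_TOKEN"]
    else:
        # Otherwise fall back to the account key (placeholder if unset).
        creds["account_key"] = env.get("AZURE_STORAGE_KEY", "<account_key>")
    return creds
```

The resulting dictionary would then be passed as the `credentials` argument when constructing the data set.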
__init__(filepath, container_name, credentials, blob_to_text_args=None, blob_from_text_args=None, load_args=None, save_args=None)
Creates a new instance of CSVBlobDataSet pointing to a concrete CSV file on Azure Blob Storage.

Parameters:
- filepath (str) – Path to an Azure blob of a CSV file.
- container_name (str) – Azure container name.
- credentials (Dict[str, Any]) – Credentials (account_name and account_key, or sas_token) to access the Azure blob.
- blob_to_text_args (Optional[Dict[str, Any]]) – Any additional arguments to pass to Azure’s get_blob_to_text method: https://docs.microsoft.com/en-us/python/api/azure.storage.blob.baseblobservice.baseblobservice?view=azure-python#get-blob-to-text
- blob_from_text_args (Optional[Dict[str, Any]]) – Any additional arguments to pass to Azure’s create_blob_from_text method: https://docs.microsoft.com/en-us/python/api/azure.storage.blob.blockblobservice.blockblobservice?view=azure-python#create-blob-from-text
- load_args (Optional[Dict[str, Any]]) – pandas options for loading CSV files. All available arguments are listed at: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html All defaults are preserved.
- save_args (Optional[Dict[str, Any]]) – pandas options for saving CSV files. All available arguments are listed at: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html All defaults are preserved, except “index”, which is set to False.

Return type: None
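To illustrate how load_args and save_args reach pandas, here is a sketch of the local half of the round trip, with the Azure transfer replaced by an in-memory string. It assumes the data set serialises with `DataFrame.to_csv` and parses with `pandas.read_csv`, as the parameter descriptions suggest. The helpers `to_csv_text` and `from_csv_text` are hypothetical names, not part of kedro.

```python
import io

import pandas as pd


def to_csv_text(data, save_args=None):
    """Hypothetical stand-in for the save step, minus the upload."""
    # Mirror the documented default: "index" is False unless overridden.
    args = {"index": False}
    args.update(save_args or {})
    return data.to_csv(**args)


def from_csv_text(text, load_args=None):
    """Hypothetical stand-in for the load step, minus the download."""
    return pd.read_csv(io.StringIO(text), **(load_args or {}))


data = pd.DataFrame({"col1": [1, 2], "col2": [4, 5], "col3": [5, 6]})

# Default arguments: the index is not written, so the frame round-trips.
reloaded = from_csv_text(to_csv_text(data))
assert data.equals(reloaded)

# Matching save_args and load_args, here a custom separator.
semicolon_text = to_csv_text(data, {"sep": ";"})
assert data.equals(from_csv_text(semicolon_text, {"sep": ";"}))
```

Options that change the file layout, such as `sep`, need to be mirrored in both load_args and save_args for a saved frame to reload unchanged.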