kedro.contrib.io.azure package

AbstractDataSet implementation for reading/writing data to Azure Blob Storage

Submodules

kedro.contrib.io.azure.csv_blob module

AbstractDataSet implementation to access CSV files directly from Microsoft’s Azure blob storage.

class kedro.contrib.io.azure.csv_blob.CSVBlobDataSet(filepath, container_name, credentials, blob_to_text_args=None, blob_from_text_args=None, load_args=None, save_args=None)[source]

Bases: kedro.io.core.AbstractDataSet

CSVBlobDataSet loads and saves CSV files in Microsoft’s Azure Blob Storage. It uses the Azure Storage SDK to read from and write to Azure, and pandas to handle the CSV data locally.
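
The local pandas half of that round trip can be sketched without any Azure access. The snippet below is only an illustration, not the class's actual implementation: the helper names frame_to_csv_text and csv_text_to_frame are hypothetical, and an in-memory string stands in for the blob content that the Azure SDK would upload or download.

```python
from io import StringIO

import pandas as pd


def frame_to_csv_text(data: pd.DataFrame, **save_args) -> str:
    """Serialise a DataFrame to CSV text (hypothetical helper).

    With no path argument, DataFrame.to_csv returns the CSV as a string.
    """
    return data.to_csv(**save_args)


def csv_text_to_frame(text: str, **load_args) -> pd.DataFrame:
    """Parse CSV text back into a DataFrame (hypothetical helper)."""
    return pd.read_csv(StringIO(text), **load_args)


data = pd.DataFrame({"col1": [1, 2], "col2": [4, 5], "col3": [5, 6]})

# Assumption: this string stands in for the text stored in the blob.
blob_text = frame_to_csv_text(data, index=False)
reloaded = csv_text_to_frame(blob_text)
```

Passing index=False here plays the same role as save_args={"index": False} in the class example: without it, the index would be written as an extra unnamed column and the reloaded frame would no longer match the original.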

Example:

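import pandas as pd
from kedro.contrib.io.azure.csv_blob import CSVBlobDataSet

data = pd.DataFrame({'col1': [1, 2], 'col2': [4, 5],
                     'col3': [5, 6]})

# Placeholder credentials; the key names shown are an assumption,
# substitute whatever your Azure storage account requires.
credentials = {"account_name": "...", "account_key": "..."}

data_set = CSVBlobDataSet(filepath="test.csv",
                           container_name="test_container",
                           credentials=credentials,
                           load_args=None,
                           save_args={"index": False})
data_set.save(data)
reloaded = data_set.load()

assert data.equals(reloaded)

__init__(filepath, container_name, credentials, blob_to_text_args=None, blob_from_text_args=None, load_args=None, save_args=None)[source]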

Creates a new instance of CSVBlobDataSet pointing to a concrete csv file on Azure blob storage.

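Parameters:
filepath: Path to the CSV file (blob) inside the container.
container_name: Name of the Azure Blob Storage container.
credentials: Credentials used to access Azure Blob Storage.
blob_to_text_args: Additional arguments passed to the Azure SDK when reading the blob as text.
blob_from_text_args: Additional arguments passed to the Azure SDK when writing the blob from text.
load_args: pandas options for loading the CSV file.
save_args: pandas options for saving the CSV file.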
Return type:

None