kedro.io

Description

kedro.io provides functionality to read from and write to a number of data sets. At the core of the library is AbstractDataSet, the base class that every data set implementation extends.

Data Catalog

kedro.io.DataCatalog([data_sets, feed_dict, …]) DataCatalog stores instances of AbstractDataSet implementations to provide load and save capabilities from anywhere in the program.
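
The registry idea behind DataCatalog can be sketched in plain Python. The block below is not kedro's implementation and does not import kedro: `InMemoryStore` and `SketchCatalog` are hypothetical stand-ins that only illustrate how data sets are registered under a name once and then loaded or saved by that name from anywhere in a program.

```python
class InMemoryStore:
    """Hypothetical stand-in for a data set: keeps one Python object in memory."""

    def __init__(self, data=None):
        self._data = data

    def load(self):
        return self._data

    def save(self, data):
        self._data = data


class SketchCatalog:
    """Hypothetical stand-in for a catalog: maps names to data set objects."""

    def __init__(self, data_sets=None):
        # Copy the mapping so later changes to the caller's dict do not leak in.
        self._data_sets = dict(data_sets or {})

    def load(self, name):
        # Delegate to whichever data set is registered under this name.
        return self._data_sets[name].load()

    def save(self, name, data):
        self._data_sets[name].save(data)


catalog = SketchCatalog({"cars": InMemoryStore([{"model": "A", "mpg": 30}])})

# Callers only need the name, not the storage details.
cars = catalog.load("cars")
catalog.save("cars", cars + [{"model": "B", "mpg": 25}])
print(len(catalog.load("cars")))
```

The point of the design is that code consuming "cars" never has to know whether the data lives in memory, in a local file, or in S3; swapping the registered object is enough.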

Data Sets

kedro.io.CSVLocalDataSet(filepath[, …]) CSVLocalDataSet loads and saves data to a local csv file.
kedro.io.CSVS3DataSet(filepath, bucket_name) CSVS3DataSet loads and saves data to a file in S3.
kedro.io.HDFLocalDataSet(filepath, key[, …]) HDFLocalDataSet loads and saves data to a local hdf file.
kedro.io.JSONLocalDataSet(filepath[, …]) JSONLocalDataSet encodes data as json and saves it to a local file or reads in and decodes an existing json file.
kedro.io.LambdaDataSet(load, save[, exists]) LambdaDataSet loads and saves data using the load and save callables passed to it.
kedro.io.MemoryDataSet([data, max_loads]) MemoryDataSet loads and saves data from/to an in-memory Python object.
kedro.io.ParquetLocalDataSet(filepath[, …]) ParquetLocalDataSet is an AbstractDataSet implementation for handling local parquet files.
kedro.io.PickleLocalDataSet(filepath[, …]) PickleLocalDataSet loads and saves a Python object to a local pickle file.
kedro.io.PickleS3DataSet(filepath, bucket_name) PickleS3DataSet loads and saves a Python object to a pickle file on S3.
kedro.io.SQLTableDataSet(table_name, credentials) SQLTableDataSet loads data from a SQL table and saves a pandas dataframe to a table.
kedro.io.SQLQueryDataSet(sql, credentials[, …]) SQLQueryDataSet loads data from a provided SQL query.
kedro.io.TextLocalDataSet(filepath[, …]) TextLocalDataSet loads and saves unstructured text files.
kedro.io.ExcelLocalDataSet(filepath[, …]) ExcelLocalDataSet loads and saves data to a local Excel file.

Additional AbstractDataSet implementations can be found in kedro.contrib.io.
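
The data sets listed above all expose the same load/save interface on top of a common base class. A minimal sketch of that pattern follows, using only the standard library. `SketchDataSet`, `SketchDataSetError` and `JSONFileSketch` are hypothetical names, not kedro classes, and wrapping every failure in a single error type is an assumption modelled on the DataSetError description rather than a copy of kedro's code.

```python
import json
import os
import tempfile
from abc import ABC, abstractmethod


class SketchDataSetError(Exception):
    """Hypothetical error raised when loading or saving fails."""


class SketchDataSet(ABC):
    """Hypothetical base class: subclasses only supply _load and _save."""

    @abstractmethod
    def _load(self):
        """Return the stored data."""

    @abstractmethod
    def _save(self, data):
        """Persist the given data."""

    def load(self):
        # Wrap any low-level failure in one predictable error type.
        try:
            return self._load()
        except Exception as exc:
            raise SketchDataSetError(f"Failed to load {type(self).__name__}") from exc

    def save(self, data):
        try:
            self._save(data)
        except Exception as exc:
            raise SketchDataSetError(f"Failed to save {type(self).__name__}") from exc


class JSONFileSketch(SketchDataSet):
    """Hypothetical data set that stores JSON-serialisable data in a local file."""

    def __init__(self, filepath):
        self._filepath = filepath

    def _load(self):
        with open(self._filepath, encoding="utf-8") as handle:
            return json.load(handle)

    def _save(self, data):
        with open(self._filepath, "w", encoding="utf-8") as handle:
            json.dump(data, handle)


with tempfile.TemporaryDirectory() as tmp:
    data_set = JSONFileSketch(os.path.join(tmp, "params.json"))
    data_set.save({"alpha": 0.1})
    reloaded = data_set.load()

    # Loading a file that was never written surfaces as the wrapper error.
    missing = JSONFileSketch(os.path.join(tmp, "missing.json"))
    try:
        missing.load()
        load_failed = False
    except SketchDataSetError:
        load_failed = True
```

Keeping the public `load` and `save` in the base class means each new storage backend only has to implement the two private methods.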

Errors

kedro.io.DataSetAlreadyExistsError DataSetAlreadyExistsError is raised by the DataCatalog class when trying to add a data set that already exists in the DataCatalog.
kedro.io.DataSetError DataSetError is raised by AbstractDataSet implementations when an input/output method fails.
kedro.io.DataSetNotFoundError DataSetNotFoundError is raised by the DataCatalog class when trying to use a data set that does not exist.

Base Classes

kedro.io.AbstractDataSet AbstractDataSet is the base class for all data set implementations.
kedro.io.FilepathVersionMixIn Mixin class which helps to version filepath-like data sets.
kedro.io.S3PathVersionMixIn Mixin class which helps to version S3 data sets.
kedro.io.Version This namedtuple is used to provide load and save versions for versioned data sets.
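
The load/save version pair can be pictured with a plain namedtuple. The sketch below defines its own `SketchVersion` instead of importing kedro.io.Version. The field names `load` and `save` follow the description above; the timestamp strings and the `versioned_path` helper are hypothetical and are not presented as kedro's actual path layout.

```python
from collections import namedtuple
from pathlib import PurePosixPath

# Hypothetical stand-in: one version to read from, one to write to.
SketchVersion = namedtuple("SketchVersion", ["load", "save"])


def versioned_path(filepath, version_id):
    """Hypothetical helper: place each version in its own sub-directory."""
    path = PurePosixPath(filepath)
    return str(path / version_id / path.name)


version = SketchVersion(load="2019-01-01T00.00.00", save="2019-02-01T00.00.00")

# Reads and writes resolve to different locations, so a save never
# overwrites the version that is currently being loaded.
load_path = versioned_path("data/cars.csv", version.load)
save_path = versioned_path("data/cars.csv", version.save)
print(load_path)
print(save_path)
```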