kedro.io.DataCatalog

class kedro.io.DataCatalog(data_sets=None, feed_dict=None)

DataCatalog stores instances of AbstractDataSet implementations to provide load and save capabilities from anywhere in the program. To use a DataCatalog, you need to instantiate it with a dictionary of data sets. Then it will act as a single point of reference for your calls, relaying load and save functions to the underlying data sets.

__init__(data_sets=None, feed_dict=None)

Parameters:
  • data_sets (Optional[Dict[str, AbstractDataSet]]) – A dictionary of data set names and data set instances.
  • feed_dict (Optional[Dict[str, Any]]) – A feed dict with data to be added in memory.

Example:

from kedro.io import CSVLocalDataSet, DataCatalog

cars = CSVLocalDataSet(filepath="cars.csv",
                       load_args=None,
                       save_args={"index": False})
io = DataCatalog(data_sets={"cars": cars})

Return type: None
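
The relaying of load and save calls described above can be illustrated without Kedro installed. The following is a minimal standard-library sketch, not Kedro's implementation: ToyCSVDataSet and ToyCatalog are hypothetical stand-ins for an AbstractDataSet implementation and for DataCatalog, and the CSV handling is deliberately simplified.

```python
import csv
import os
import tempfile


class ToyCSVDataSet:
    """Hypothetical stand-in for a data set: loads and saves one CSV file."""

    def __init__(self, filepath):
        self._filepath = filepath

    def load(self):
        # Read every row as a dict keyed by the header line.
        with open(self._filepath, newline="") as f:
            return list(csv.DictReader(f))

    def save(self, rows):
        # Write a header taken from the first row, then all rows.
        with open(self._filepath, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)


class ToyCatalog:
    """Hypothetical stand-in for the catalog: maps names to data sets."""

    def __init__(self, data_sets=None):
        self._data_sets = dict(data_sets or {})

    def load(self, name):
        # Relay the call to the data set registered under this name.
        return self._data_sets[name].load()

    def save(self, name, data):
        self._data_sets[name].save(data)


path = os.path.join(tempfile.mkdtemp(), "cars.csv")
catalog = ToyCatalog(data_sets={"cars": ToyCSVDataSet(path)})

# Callers only need the registered name, not the file path or format.
catalog.save("cars", [{"make": "Volvo", "year": "2019"}])
cars = catalog.load("cars")
```

The point of the design is that code calling catalog.load("cars") does not change if the data set behind the name is later swapped for a different storage backend.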

Methods

  • __init__([data_sets, feed_dict]) – DataCatalog stores instances of AbstractDataSet implementations to provide load and save capabilities from anywhere in the program.
  • add(data_set_name, data_set[, replace]) – Adds a new AbstractDataSet object to the DataCatalog.
  • add_all(data_sets[, replace]) – Adds a group of new data sets to the DataCatalog.
  • add_feed_dict(feed_dict[, replace]) – Adds instances of MemoryDataSet, containing the data provided through feed_dict.
  • exists(name) – Checks whether a registered data set exists by calling its exists() method.
  • from_config(catalog[, credentials, …]) – Creates a DataCatalog instance from configuration.
  • list() – Lists the DataSet names registered in the catalog.
  • load(name) – Loads a registered data set.
  • save(name, data) – Saves data to a registered data set.
  • shallow_copy() – Returns a shallow copy of the current object.
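
How the registration methods summarised above might fit together can be sketched in plain Python. This is an illustrative toy under stated assumptions, not Kedro's code: ToyMemoryDataSet and ToyCatalog are hypothetical names, raising ValueError on a duplicate name is an assumption made for the sketch, and the real classes may behave differently in detail.

```python
class ToyMemoryDataSet:
    """Hypothetical in-memory data set, standing in for MemoryDataSet."""

    def __init__(self, data=None):
        self._data = data

    def load(self):
        return self._data

    def save(self, data):
        self._data = data

    def exists(self):
        return self._data is not None


class ToyCatalog:
    """Hypothetical catalog mirroring the method names listed above."""

    def __init__(self, data_sets=None, feed_dict=None):
        self._data_sets = dict(data_sets or {})
        if feed_dict:
            self.add_feed_dict(feed_dict)

    def add(self, data_set_name, data_set, replace=False):
        # Assumption: refuse to overwrite a name unless replace=True.
        if data_set_name in self._data_sets and not replace:
            raise ValueError(f"{data_set_name!r} is already registered")
        self._data_sets[data_set_name] = data_set

    def add_feed_dict(self, feed_dict, replace=False):
        # Wrap each raw value in an in-memory data set.
        for name, data in feed_dict.items():
            self.add(name, ToyMemoryDataSet(data), replace)

    def exists(self, name):
        # Delegate to the data set's own exists() method.
        return self._data_sets[name].exists()

    def list(self):
        return [name for name in self._data_sets]

    def load(self, name):
        return self._data_sets[name].load()

    def save(self, name, data):
        self._data_sets[name].save(data)

    def shallow_copy(self):
        # New catalog, same underlying data set instances.
        return ToyCatalog(data_sets=self._data_sets)


catalog = ToyCatalog(feed_dict={"params": {"test_size": 0.2}})
catalog.add("scores", ToyMemoryDataSet())
empty_before_save = catalog.exists("scores")

catalog.save("scores", [0.9, 0.8])
clone = catalog.shallow_copy()
clone.add("extra", ToyMemoryDataSet(1))  # registered on the copy only
```

In this sketch the copy shares data set instances with the original, so data saved through one catalog is visible through the other, while names added to the copy do not appear in the original.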