kedro.pipeline.Pipeline

class kedro.pipeline.Pipeline(nodes, name=None)

A Pipeline is defined as a collection of Node objects. This class treats nodes as part of a graph representation and provides inputs, outputs and execution order.

__init__(nodes, name=None)

Initialise Pipeline with a list of Node instances.

Parameters:
- nodes (Iterable[Union[Node, Pipeline]]) – The list of nodes the Pipeline will be made of. If you provide pipelines among the list of nodes, those pipelines will be expanded and all their nodes will become part of this new pipeline.
- name (Optional[str]) – The name of the pipeline. If specified, this name will be used to tag all of the nodes in the pipeline.

Raises:
- ValueError – When an empty list of nodes is provided, or when not all nodes have unique names.
- CircularDependencyError – When visiting all the nodes is not possible due to the existence of a circular dependency.
- OutputNotUniqueError – When multiple Node instances produce the same output.

Example:

    from kedro.pipeline import Pipeline
    from kedro.pipeline import node

    # In the following scenario first_ds and second_ds are data sets
    # provided by io. The Pipeline passes these data sets to the
    # first_node function and provides the result to second_node
    # as input.

    def first_node(first_ds, second_ds):
        return dict(third_ds=first_ds+second_ds)

    def second_node(third_ds):
        return third_ds

    pipeline = Pipeline([
        # Outputs are declared as a dict so that they match the dict
        # returned by first_node.
        node(first_node, ['first_ds', 'second_ds'], dict(third_ds='third_ds')),
        node(second_node, dict(third_ds='third_ds'), 'fourth_ds')])

    pipeline.describe()

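To make the idea of free inputs concrete, the sketch below recomputes them for the example pipeline in plain Python, without importing Kedro. It assumes each node can be reduced to a pair of input and output name sets. The `NODES` table and the helper functions are hypothetical names made up for this illustration, and treating "outputs that no node consumes" as the pipeline's outputs is likewise an assumption rather than a statement about Kedro's implementation.

```python
# Plain-Python illustration; none of these names come from Kedro.
# Each entry maps a node name to (input data set names, output data set names),
# mirroring the two nodes of the example pipeline.
NODES = {
    "first_node": ({"first_ds", "second_ds"}, {"third_ds"}),
    "second_node": ({"third_ds"}, {"fourth_ds"}),
}


def all_inputs(nodes):
    """Union of every node's inputs."""
    return set().union(*(inputs for inputs, _ in nodes.values()))


def all_outputs(nodes):
    """Union of every node's outputs."""
    return set().union(*(outputs for _, outputs in nodes.values()))


def free_inputs(nodes):
    """Inputs that no node produces, so they must be supplied at runtime."""
    return all_inputs(nodes) - all_outputs(nodes)


def final_outputs(nodes):
    """Outputs that no node consumes (assumed to be the pipeline's outputs)."""
    return all_outputs(nodes) - all_inputs(nodes)


print(sorted(free_inputs(NODES)))    # ['first_ds', 'second_ds']
print(sorted(final_outputs(NODES)))  # ['fourth_ds']
```

Here third_ds is neither a free input nor a final output, because it is produced by one node and consumed by another.
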
Methods

__init__(nodes[, name])
    Initialise Pipeline with a list of Node instances.
all_inputs()
    All inputs for all nodes in the pipeline.
all_outputs()
    All outputs of all nodes in the pipeline.
data_sets()
    The names of all data sets used by the Pipeline, including inputs and outputs.
decorate(*decorators)
    Create a new Pipeline by applying the provided decorators to all the nodes in the pipeline.
describe([names_only])
    Obtain the order of execution and expected free input variables in a loggable pre-formatted string.
from_inputs(*inputs)
    Create a new Pipeline object with the nodes which depend directly or transitively on the provided inputs.
from_nodes(*node_names)
    Create a new Pipeline object with the nodes which depend directly or transitively on the provided nodes.
inputs()
    The names of free inputs that must be provided at runtime so that the pipeline is runnable.
only_nodes(*node_names)
    Create a new Pipeline which will contain only the specified nodes by name.
only_nodes_with_inputs(*inputs)
    Create a new Pipeline object with the nodes which depend directly on the provided inputs.
only_nodes_with_outputs(*outputs)
    Create a new Pipeline object with the nodes which are directly required to produce the provided outputs.
only_nodes_with_tags(*tags)
    Create a new Pipeline object with the nodes which contain any of the provided tags.
outputs()
    The names of outputs produced when the whole pipeline is run.
to_json()
    Return a JSON representation of the pipeline.
to_nodes(*node_names)
    Create a new Pipeline object with the nodes required directly or transitively by the provided nodes.
to_outputs(*outputs)
    Create a new Pipeline object with the nodes which are directly or transitively required to produce the provided outputs.

Attributes

grouped_nodes
    Return a list of the pipeline nodes in topologically ordered groups.
name
    Get the pipeline name.
node_dependencies
    All pairs of nodes where the first Node has a direct dependency on the second Node.
nodes
    Return a list of the pipeline nodes in topological order.
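
The difference between a flat topological order and topologically ordered groups can be illustrated with the standard library alone. The sketch below uses `graphlib.TopologicalSorter` (Python 3.9 or later) on a hypothetical dependency map; `DEPENDENCIES`, `grouped_order` and `flat_order` are names invented for this example, and the `CycleError` raised by `graphlib` is only an analogue of CircularDependencyError, not the way Kedro itself detects cycles.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical dependency map: each node name maps to the set of node
# names that must run before it. None of these names come from Kedro.
DEPENDENCIES = {
    "load": set(),
    "clean": {"load"},
    "features": {"clean"},
    "report": {"clean"},
    "model": {"features"},
}


def grouped_order(dependencies):
    """Return node names in groups; each group depends only on earlier groups."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if the graph contains a cycle
    groups = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable result
        groups.append(ready)
        sorter.done(*ready)
    return groups


def flat_order(dependencies):
    """Flatten the groups into a single topological order."""
    return [name for group in grouped_order(dependencies) for name in group]


print(grouped_order(DEPENDENCIES))
# [['load'], ['clean'], ['features', 'report'], ['model']]

try:
    grouped_order({"a": {"b"}, "b": {"a"}})
except CycleError:
    print("circular dependency detected")
```

Nodes in the same group, such as features and report here, do not depend on each other, so their relative order within the flat list is arbitrary.
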