The order in which the stages or tasks of a pipeline are arranged is set explicitly by the @follows(...) Python decorator:
    from ruffus import *
    import sys

    def first_task():
        print("First task")

    @follows(first_task)
    def second_task():
        print("Second task")

    @follows(second_task)
    def final_task():
        print("Final task")

The @follows decorator indicates that the first_task function precedes second_task in the pipeline, and that second_task in turn precedes final_task.
Note
We shall see in Chapter 2 that the order of pipeline tasks can also be inferred implicitly for certain decorators.
- Now we can run the pipeline by:
    pipeline_run([final_task])

Because final_task depends on second_task, which depends on first_task, all three functions will be executed in order.
- We can see a flowchart of our fledgling pipeline by executing:
    pipeline_printout_graph("manual_follows1.png",
                            "png",
                            [final_task],
                            no_key_legend=True)

producing the following flowchart:
![]()
- or in text format with:
    pipeline_printout(sys.stdout, [final_task])

- which produces the following:

    Task = first_task
    Task = second_task
    Task = final_task
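To illustrate why asking for final_task alone causes all three functions to run in order, here is a toy sketch of dependency resolution in plain Python. It is not how Ruffus is implemented: the `follows` decorator below is a stand-in written for this example, and `run_pipeline` and the `_antecedents` attribute are hypothetical names. The sketch assumes there are no circular dependencies and does no up-to-date checking.

```python
# Toy illustration only: this is NOT the Ruffus implementation.
# `follows`, `run_pipeline` and `_antecedents` are hypothetical names.

def follows(*antecedents):
    """Record the tasks that must run before the decorated function."""
    def decorator(func):
        func._antecedents = list(antecedents)
        return func
    return decorator


def run_pipeline(targets):
    """Run each target after all of its antecedents, each task only once.

    Assumes the dependency graph has no cycles (no cycle detection here).
    """
    completed = []

    def visit(task):
        if task in completed:
            return
        # Depth-first: run everything this task depends on before the task.
        for antecedent in getattr(task, "_antecedents", []):
            visit(antecedent)
        task()
        completed.append(task)

    for target in targets:
        visit(target)
    return [task.__name__ for task in completed]


def first_task():
    print("First task")


@follows(first_task)
def second_task():
    print("Second task")


@follows(second_task)
def final_task():
    print("Final task")


# Only the last task is requested; its antecedents are pulled in.
order = run_pipeline([final_task])
print(order)
```

Requesting only the final task is enough because each task carries a reference to the tasks it follows, so the whole chain can be walked backwards from the target.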
All this assumes that your pipelined tasks are defined in order (first_task before second_task before final_task).
This is usually the most sensible way to arrange your code.
If you wish to refer to tasks which are not yet defined, you can do so by quoting the function name as a string:
    @follows("second_task")
    def final_task():
        print("Final task")

You can refer to tasks (functions) in other modules, in which case the fully qualified name must be used:

    @follows("other_module.second_task")
    def final_task():
        print("Final task")
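To make concrete what a fully qualified name such as "other_module.second_task" refers to, the following sketch turns a dotted string into the function it names, using only the standard library. `resolve_qualified_name` is a hypothetical helper for illustration; Ruffus performs its own lookup, and this is not a description of it. Since `other_module` does not exist here, `os.path.basename` stands in for it.

```python
import importlib


def resolve_qualified_name(qualified_name):
    """Return the object named by a "module.function" string.

    Hypothetical helper for illustration; Ruffus has its own lookup logic.
    """
    # Split at the last dot: everything before it is the module path.
    module_name, _, function_name = qualified_name.rpartition(".")
    if not module_name:
        raise ValueError("expected 'module.function', got %r" % qualified_name)
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


# "os.path.basename" stands in for something like "other_module.second_task".
basename = resolve_qualified_name("os.path.basename")
print(basename("output/results/here"))
```

Looking the name up only when it is needed is what allows a task to be referenced before the function itself has been defined.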
Each task can depend on more than one antecedent task.
- This can be indicated either by stacking @follows:
    @follows(first_task)
    @follows("second_task")
    def final_task():
        ""

- or in a more concise way:

    @follows(first_task, "second_task")
    def final_task():
        ""
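The equivalence of the stacked and concise forms can be seen with a toy bookkeeping decorator. As before, this `follows` is a stand-in written for the example, not the Ruffus decorator, and `_antecedents` and `antecedent_names` are hypothetical names. Note that stacked decorators are applied bottom-up, so the two forms record the same antecedents but not necessarily in the same order; the comparison below therefore uses sets.

```python
# Toy bookkeeping decorator; hypothetical, not the Ruffus implementation.

def follows(*antecedents):
    """Append the given antecedents to any already recorded on the function."""
    def decorator(func):
        recorded = getattr(func, "_antecedents", [])
        func._antecedents = recorded + list(antecedents)
        return func
    return decorator


def first_task():
    pass


# Stacked form: two decorators, one antecedent each.
@follows(first_task)
@follows("second_task")
def stacked_final_task():
    pass


# Concise form: one decorator, two antecedents.
@follows(first_task, "second_task")
def concise_final_task():
    pass


def antecedent_names(func):
    """Normalise antecedents (functions or strings) to a set of names."""
    return {getattr(a, "__name__", a) for a in func._antecedents}


print(antecedent_names(stacked_final_task) == antecedent_names(concise_final_task))
```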
A common prerequisite for any computational task is making sure that the destination directories exist.
Ruffus provides special syntax to support this, using the mkdir dependency. For example:
    @follows(first_task, mkdir("output/results/here"))
    def second_task():
        print("Second task")

This makes sure that output/results/here exists before second_task is run.
In other words, it will make the output/results/here directory if it does not exist.
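The "make it if it does not exist" behaviour can be approximated in plain Python as follows. This is a sketch of the idea using the standard library, not the Ruffus mkdir code; `ensure_directory` is a hypothetical name, and the sketch assumes that missing intermediate directories should be created as well.

```python
import os
import tempfile


def ensure_directory(path):
    """Create path (and any missing parents) unless it already exists."""
    # exist_ok=True makes the call a no-op when the directory is present.
    os.makedirs(path, exist_ok=True)
    return path


# Demonstrate inside a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "output", "results", "here")
    ensure_directory(target)
    ensure_directory(target)  # calling it again is harmless
    print(os.path.isdir(target))
```

Because the call is safe to repeat, it can be made every time the pipeline runs without first checking whether the directory is already there.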