Cheat Sheet

The ruffus module is a lightweight way to add support for running computational pipelines.

Each stage or task in a computational pipeline is represented by a Python function.
Each Python function can be called in parallel to run multiple jobs.
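
The relationship between one task function and its many jobs can be sketched with the standard library alone. This is only an illustration of the idea, not how Ruffus schedules work: the function `count_words` and the job parameters are hypothetical, and a thread pool stands in for Ruffus's own job runner.

```python
from concurrent.futures import ThreadPoolExecutor


def count_words(text, label):
    """One 'task': each call with its own parameters is one 'job'."""
    return label, len(text.split())


# Hypothetical job parameters: one tuple per job.
jobs = [
    ("a b c", "first"),
    ("d e", "second"),
    ("f", "third"),
]

with ThreadPoolExecutor(max_workers=3) as pool:
    # map() returns results in job order even though calls may overlap in time.
    results = list(pool.map(lambda params: count_words(*params), jobs))

print(results)
```

The same function body serves every job; only the parameters differ, which is the pattern the decorators below describe declaratively.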

1. Annotate functions with Ruffus decorators

Basic

Decorator Syntax  
@follows (Manual)
@follows ( task1, 'task2' )
@follows ( task1, mkdir( 'my/directory/for/results' ))
 
@files (Manual)
@files( parameter_list )
@files( parameter_generating_function )
@files ( input_file, output_file, other_params, ... )
 

Core

Decorator Syntax  
@split (Manual)
@split ( tasks_or_file_names, output_files, [extra_parameters,...] )
 
@transform (Manual)
@transform ( tasks_or_file_names, suffix(suffix_string), output_pattern, [extra_parameters,...] )
@transform ( tasks_or_file_names, regex(regex_pattern), output_pattern, [extra_parameters,...] )
 
@merge (Manual)
@merge ( tasks_or_file_names, output, [extra_parameters,...] )
 
@posttask (Manual)
@posttask ( signal_task_completion_function )
@posttask (touch_file( 'task1.completed' ))
 

See Decorators for a complete list of decorators.
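
The two @transform forms above differ in how an output name is derived from each input name. The sketch below imitates that naming step in plain Python, assuming that suffix(...) replaces a trailing suffix with the output pattern and that regex(...) applies a regular-expression substitution. The helpers `suffix_output` and `regex_output` and the file names are hypothetical, and Ruffus's actual matching rules may differ in detail.

```python
import re


def suffix_output(input_name, suffix_string, output_pattern):
    """Swap a trailing suffix for output_pattern; None if the suffix is absent."""
    if not input_name.endswith(suffix_string):
        return None
    return input_name[: len(input_name) - len(suffix_string)] + output_pattern


def regex_output(input_name, regex_pattern, output_pattern):
    """Build the output name by regex substitution; None if nothing matches."""
    if re.search(regex_pattern, input_name) is None:
        return None
    return re.sub(regex_pattern, output_pattern, input_name)


# Hypothetical input files; the last one matches neither pattern.
inputs = ["a.fasta", "b.fasta", "notes.txt"]

by_suffix = [suffix_output(name, ".fasta", ".counts") for name in inputs]
by_regex = [
    regex_output(name, r"(.+)\.fasta$", r"results/\1.counts") for name in inputs
]

print(by_suffix)
print(by_regex)
```

The suffix form is enough when only the file extension changes; the regex form is the one to reach for when captured parts of the input name need to be rearranged, for example to place outputs in another directory.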

3. Run the pipeline

pipeline_run(list_of_target_tasks, [list_of_tasks_forced_to_rerun, multiprocess = N_PARALLEL_JOBS])
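
Asking for a target task implies running everything that task follows first. The sketch below shows only that ordering step, using the standard library rather than Ruffus itself. The task names and the `tasks_to_run` helper are hypothetical, and up-to-date checks, forced reruns and parallel execution are deliberately left out.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task graph: each task maps to the tasks it follows.
follows = {
    "split_input": [],
    "transform_chunks": ["split_input"],
    "merge_results": ["transform_chunks"],
    "unrelated_report": [],
}


def tasks_to_run(follows, target_tasks):
    """Return the targets plus everything upstream of them, dependencies first."""
    needed = set()
    pending = list(target_tasks)
    while pending:
        task = pending.pop()
        if task not in needed:
            needed.add(task)
            pending.extend(follows[task])
    # Restrict the graph to the needed tasks, then order it.
    subgraph = {task: follows[task] for task in needed}
    return list(TopologicalSorter(subgraph).static_order())


order = tasks_to_run(follows, ["merge_results"])
print(order)
```

Tasks that are not upstream of any target, such as `unrelated_report` here, are left out of the run entirely.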

See the Simple Tutorial for a quick introduction on how to add support for ruffus.