Omics Pipe is an open-source, modular computational platform that automates ‘best practice’ multi-omics data analysis pipelines published in Nature Protocols, as well as other commonly used pipelines such as GATK. It currently automates and provides summary reports for two RNA-seq pipelines, two miRNA-seq pipelines, variant calling from whole exome sequencing (WES), variant calling and copy number variation analysis from whole genome sequencing (WGS), two ChIP-seq pipelines and a custom RNA-seq pipeline for personalized genomic medicine reporting. It also provides automated support for working with The Cancer Genome Atlas (TCGA) datasets, including automatic download and processing of the samples in this database.
Omics Pipe is a Python package that can be installed locally, on a compute cluster or in the cloud. It can be downloaded directly from the Omics Pipe website for local and cluster installation, or used on AWS in Amazon EC2. Its modular design lets researchers add new analysis tools, written as Bash scripts, in the form of modules that can then be assembled into a new analysis pipeline.
Omics Pipe uses Ruffus to chain the analysis modules into a parallel, automated pipeline. Because it builds on Ruffus, only the steps that need updating are rerun when a pipeline is restarted after an error. Sumatra is also built into Omics Pipe and provides version control for each run of the pipeline, improving the reproducibility and documentation of your analyses. Omics Pipe uses the Distributed Resource Management Application API (DRMAA) to automatically submit, control and monitor jobs on a distributed resource management system, such as a compute cluster or grid computing infrastructure. This lets you run samples and pipeline steps in parallel in a computationally efficient, distributed fashion, without scheduling and monitoring each job yourself.
For each supported pipeline, Omics Pipe generates result files for every step and an HTML analysis summary report built with the R package knitr. The summary report provides quality control metrics and visualizations of the results for each sample, so that researchers can interpret the results of the pipeline quickly and easily.
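To illustrate the idea of rerunning only the steps that need updating, here is a minimal sketch that uses only the Python standard library. It is not Ruffus's API and not Omics Pipe's code: the step functions, file names and the `run_pipeline` helper are hypothetical, and the up-to-date check is a simple comparison of file modification times, assumed here as a stand-in for what a pipeline framework does.

```python
import os
import tempfile


def is_up_to_date(input_path, output_path):
    """True if the output exists and is at least as new as its input."""
    if not os.path.exists(output_path):
        return False
    return os.path.getmtime(output_path) >= os.path.getmtime(input_path)


def uppercase_step(input_path, output_path):
    """Hypothetical step 1: write an upper-cased copy of the input."""
    with open(input_path) as src, open(output_path, "w") as dst:
        dst.write(src.read().upper())


def count_step(input_path, output_path):
    """Hypothetical step 2: write the number of lines in the input."""
    with open(input_path) as src, open(output_path, "w") as dst:
        dst.write(str(len(src.readlines())))


def run_pipeline(workdir):
    """Run the steps in order, skipping any whose output is current.

    Returns a list of (step name, "ran" or "skipped") tuples.
    """
    raw = os.path.join(workdir, "sample.txt")
    upper = os.path.join(workdir, "sample.upper.txt")
    count = os.path.join(workdir, "sample.count.txt")
    steps = [
        ("uppercase", uppercase_step, raw, upper),
        ("count", count_step, upper, count),
    ]
    log = []
    for name, func, src, dst in steps:
        if is_up_to_date(src, dst):
            log.append((name, "skipped"))
        else:
            func(src, dst)
            log.append((name, "ran"))
    return log


def demo():
    """Run the pipeline three times: fresh, unchanged, and after losing
    the final output (as if the last step had failed)."""
    with tempfile.TemporaryDirectory() as workdir:
        with open(os.path.join(workdir, "sample.txt"), "w") as fh:
            fh.write("acgt\nttga\n")
        first = run_pipeline(workdir)
        second = run_pipeline(workdir)
        os.remove(os.path.join(workdir, "sample.count.txt"))
        third = run_pipeline(workdir)
        return first, second, third
```

Calling `demo()` runs both steps the first time, skips both on the second pass, and reruns only the final step once its output is missing. This mirrors restarting a pipeline after a late-stage failure without repeating the earlier work.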
Projects that have used Omics Pipe for solving biological problems. Please submit your story if you would like to share how you use the pipeline for your own research.
Omics Pipe is developed by Kathleen Fisch, Tobias Meissner and Louis Gioia at The Su Lab in the Department of Molecular and Experimental Medicine at The Scripps Research Institute in beautiful La Jolla, CA.
Feedback, questions, bug reports, contributions, collaborations, etc. welcome!
Email: kfisch@scripps.edu
Twitter: @kathleenfisch