Saturday, May 19

Bioinformatics

Sat, May 19, 10:30 AM - 12:00 PM
Regency Ballroom A

Big Data Distributed System for Phenome and Genome Management and Analysis in a Large Health System (304605)

Aaron Black, Inova Translational Medicine Institute
John F Deeken, Inova Translational Medicine Institute
Shan Gao, Inova Translational Medicine Institute
Henry Hunter, Inova Translational Medicine Institute
Prachi Kothiyal, Inova Translational Medicine Institute
Xinyue Liu, Inova Translational Medicine Institute
Sakthi Madhappan, Inova Translational Medicine Institute
John E Niederhuber, Inova Translational Medicine Institute
Lin Smith, Inova Translational Medicine Institute
*Wendy S.W. Wong, Inova Translational Medicine Institute
Fang Zhou, Inova Translational Medicine Institute
Wei Zhu, Inova Translational Medicine Institute

Keywords: big data, health systems, genomics, Hadoop, Spark, Cloudera

The continuous influx of high-throughput sequencing data quickly overwhelms bioinformatics analysis paradigms based on traditional clusters and relational databases. Innovative "big data" solutions built on the open-source Apache Hadoop and Spark cluster technologies have been employed to address this challenge, and ADAM and Hail are two of the cutting-edge projects in big data genomics. To leverage these powerful new tools for the practical needs of Inova Health System's translational genomic research, we are building an integrated system composed of a Hadoop data warehouse (DW) with Cloudera Impala as the backend, an ETL (extract, transform, load) workflow using ADAM and Spark, an analysis middle tier powered by Spark and Hail, and a web front end for ad hoc queries and interactive data analysis. Example use cases are presented to demonstrate the ability of our integrated big data genomic system to handle petabyte-scale data.

ASA Meetings Department