Online Program

Friday, May 31
Data Science Technologies
Data Science Platforms: Spark
Fri, May 31, 10:30 AM - 12:05 PM
Grand Ballroom E

An R Interface to Hail (305027)


*Michael Lawrence, Genentech Research 

Keywords: spark, hail, R

Hail is a Spark-based framework for genomic computing at scale. We have explored the application of deferred evaluation to the construction of an interface between R and Hail. The interface implements standard base R and Bioconductor APIs on top of Hail by constructing expressions in Hail's interface language and evaluating them using sparklyr. Users require no special knowledge of Hail or Spark. We will describe the design of the interface and demonstrate the manipulation of a Hail-backed SummarizedExperiment object, the core abstraction for genomic data in Bioconductor.
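The deferred-evaluation idea in the abstract can be illustrated with a minimal, language-agnostic sketch, written here in Python rather than R. It is not the interface described in the talk and uses no Hail, Spark, or sparklyr calls. All names (Expr, Field, Const, BinOp, render, evaluate) are hypothetical, the rendered string is a made-up syntax rather than Hail's expression language, and the evaluate function is only a local stand-in for a remote backend. The point it shows is that ordinary operators build an expression tree, and no arithmetic happens until evaluation is explicitly requested.

```python
from dataclasses import dataclass


class Expr:
    """Base class: arithmetic operators build a tree instead of computing."""

    def __add__(self, other):
        return BinOp("+", self, wrap(other))

    def __radd__(self, other):
        return BinOp("+", wrap(other), self)

    def __mul__(self, other):
        return BinOp("*", self, wrap(other))

    def __rmul__(self, other):
        return BinOp("*", wrap(other), self)


@dataclass(frozen=True)
class Field(Expr):
    """A named column or field whose values live on the backend."""
    name: str

    def render(self) -> str:
        return self.name


@dataclass(frozen=True)
class Const(Expr):
    """A literal value captured from the host language."""
    value: float

    def render(self) -> str:
        return repr(self.value)


@dataclass(frozen=True)
class BinOp(Expr):
    """A deferred binary operation on two sub-expressions."""
    op: str
    left: Expr
    right: Expr

    def render(self) -> str:
        # Hypothetical textual form that would be sent to a backend.
        return f"({self.left.render()} {self.op} {self.right.render()})"


def wrap(value):
    """Lift plain values into the expression tree."""
    return value if isinstance(value, Expr) else Const(value)


def evaluate(expr: Expr, env: dict):
    """Stand-in backend: walks the tree only when a result is requested."""
    if isinstance(expr, Field):
        return env[expr.name]
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, BinOp):
        left, right = evaluate(expr.left, env), evaluate(expr.right, env)
        return left + right if expr.op == "+" else left * right
    raise TypeError(f"unsupported expression: {expr!r}")


# Building the expression performs no arithmetic.
expr = Field("x") * 2 + 1
print(expr.render())
print(evaluate(expr, {"x": 3}))
```

In a design like the one the abstract describes, the rendering step would presumably target the backend's own expression language and the evaluation step would run remotely, so the user writes only familiar host-language operations.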