St. James Ballroom
Visual Exploration of Big Biological Data (303893)
*Lauren Holt Lenz, Utah State University; John R. Stevens, Utah State University
Keywords: Data Visualization, Big Data, BigVis, Tidyverse, Exploratory Data Analysis
Identifying variables of interest in big data can be time-consuming and difficult. Statistical visualizations help with the process, but over-plotting and computation times are often restrictive. Hadley Wickham’s "bigvis" R package extends "ggplot2" to big data and allows the user to create quick, clean, informative plots in one, two, and three variables. Here, visualization best practices are applied to "bigvis" to identify variables of interest to be explored further, with special consideration given to three-variable plots. The data sets presented here include RNA-Seq read-count data sets from cancer and agriculture, but these principles of statistical visualization can be used in the exploration of any large data set.
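As a rough illustration of why summarising before plotting helps with over-plotting, the sketch below counts points in fixed-width two-dimensional bins so that a plot would draw one tile per bin rather than one mark per observation. It is written in plain Python and is not the "bigvis" API; the function name bin_2d, the bin widths, and the synthetic data are all assumptions made for this example and do not come from the RNA-Seq data sets described above.

```python
import math
import random
from collections import Counter


def bin_2d(xs, ys, x_width, y_width, x_origin=0.0, y_origin=0.0):
    """Count points per fixed-width 2D bin.

    Hypothetical helper for illustration only (not part of bigvis).
    Returns a Counter mapping (x_bin_index, y_bin_index) -> count.
    """
    counts = Counter()
    for x, y in zip(xs, ys):
        ix = math.floor((x - x_origin) / x_width)
        iy = math.floor((y - y_origin) / y_width)
        counts[(ix, iy)] += 1
    return counts


# Synthetic stand-in data (an assumption, not real read counts):
# two correlated variables with 100,000 observations.
rng = random.Random(0)
n = 100_000
xs = [rng.gauss(5.0, 2.0) for _ in range(n)]
ys = [x + rng.gauss(0.0, 1.0) for x in xs]

# Condense the raw points into bin counts; a heat map of these counts
# would replace a scatter plot of all n points.
counts = bin_2d(xs, ys, x_width=0.5, y_width=0.5)
print(len(counts), "bins summarise", sum(counts.values()), "points")
```

The bin counts can then be passed to any plotting tool as a heat map. The bin width is a trade-off: wider bins give fewer tiles and faster drawing, but hide more fine structure.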