Online Program

Friday, February 20
CS08 Exploratory and Interactive Graphics
Fri, Feb 20, 11:00 AM - 12:30 PM
Maurepas

Visualizing Data with Exploratory Data Analysis (302932)

*Wendy L. Martinez, U.S. Bureau of Labor Statistics 

Keywords: Visualization, parallel coordinates, clustering, scatter plots, smoothing, Andrews curves

Exploratory data analysis (EDA) is an important first step in any data analysis task. The idea behind EDA is to examine the data from different perspectives before any hypotheses or models have been developed. In this way, one looks for information about relationships and structure that could inform the rest of the analysis. In this talk, I will briefly define EDA and the two main areas it includes: pattern discovery and visualization. The main focus will be on EDA visualization techniques that can reveal interesting structures such as outliers, holes, and clusters. Examples include scatter plot matrices and smoothing, Andrews’ curves and images, parallel coordinate plots, brushing and linking, and tree maps. I will use publicly available data from different disciplines and applications to illustrate the concepts. The examples will be implemented in R or MATLAB, and the code used to create the visualizations will be provided. The goal of this presentation is to help audience members learn about these methods and to provide examples that enable them to use the ideas in their own statistical analyses.
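
As a rough illustration of one technique named in the abstract, the sketch below evaluates Andrews' curves, which map each multivariate observation x = (x1, x2, x3, ...) to the function f(t) = x1/sqrt(2) + x2 sin(t) + x3 cos(t) + x4 sin(2t) + x5 cos(2t) + ... on the interval from -pi to pi. It is not the speaker's code: the talk's examples are in R or MATLAB, while this sketch uses Python, and the function names and toy data are assumptions made only for illustration.

```python
import math


def andrews_curve(x, t):
    """Evaluate the Andrews curve of one observation x at angle t."""
    # First coordinate contributes the constant term x1 / sqrt(2).
    value = x[0] / math.sqrt(2)
    # Remaining coordinates alternate sin and cos with frequencies 1, 1, 2, 2, ...
    for j, xj in enumerate(x[1:], start=1):
        k = (j + 1) // 2
        if j % 2 == 1:
            value += xj * math.sin(k * t)
        else:
            value += xj * math.cos(k * t)
    return value


def andrews_curves(data, n_points=101):
    """Return a grid of angles on [-pi, pi] and one curve per observation."""
    ts = [-math.pi + 2 * math.pi * i / (n_points - 1) for i in range(n_points)]
    curves = [[andrews_curve(row, t) for t in ts] for row in data]
    return ts, curves


# Hypothetical toy data: two similar observations and one that differs.
toy = [[1.0, 2.0, 0.5], [1.1, 2.1, 0.4], [5.0, -1.0, 3.0]]
ts, curves = andrews_curves(toy)
```

Plotting each entry of `curves` against `ts` gives one line per observation. Observations that are close in the original space produce curves that stay close together, which is why the display can suggest clusters and outliers.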