Friday, February 24
CS04 Keep It Simple with R Fri, Feb 24, 9:15 AM - 10:45 AM
City Terrace 9

Managing Many Models (303291)

*Hadley Wickham, RStudio 

Visualization alone is not enough to solve most data analysis challenges. The data may be too big or too messy to show in a single plot. In this talk, I'll outline my current thinking about how the synthesis of visualization, modelling, and data manipulation allows you to effectively explore and understand large and complex datasets. There are three key ideas:

1. Using tidyr to make a nested data frame, where one column is a list of data frames.
2. Using purrr to apply functional programming tools instead of writing for loops.
3. Visualizing models by converting them to tidy data with broom, a package by David Robinson.
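
The first two ideas can be sketched roughly as follows. This is a minimal illustration, not code from the talk: the built-in mtcars dataset, the grouping by cyl, and the helper fit_model() are all assumptions chosen only to keep the example self-contained.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Nested data frame: one row per cylinder count, with a list-column `data`
# holding the rows of mtcars that belong to that group.
# (mtcars and the grouping variable are illustrative assumptions.)
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  ungroup()

# Hypothetical helper: fit a simple linear model to one group's data frame.
fit_model <- function(df) lm(mpg ~ wt, data = df)

# For-loop version: preallocate a list, then fill it element by element.
models_loop <- vector("list", nrow(by_cyl))
for (i in seq_len(nrow(by_cyl))) {
  models_loop[[i]] <- fit_model(by_cyl$data[[i]])
}

# purrr version: map() applies fit_model() to every element of the
# list-column and stores the fitted models alongside their data.
by_cyl <- by_cyl %>%
  mutate(model = map(data, fit_model))
```

Keeping the models in a column of the same data frame means each model stays on the same row as the data and group label it was fitted from, which the loop version has to track by position.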

This work is embedded in R, so I'll not only talk about the ideas but also show concrete code for working with large sets of models. You'll see how you can combine the dplyr and purrr packages to fit many models, then use tidyr and broom to convert the results to tidy data, which can be visualized with ggplot2.
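
As a rough sketch of that workflow, under assumptions that are not from the talk (the mtcars dataset, a model of mpg on wt, and the choice of plot), the pieces might fit together like this:

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)
library(ggplot2)

# Fit one model per group, then convert each model to a tidy data frame.
# (Dataset and model formula are illustrative assumptions.)
coefs <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  ungroup() %>%
  mutate(
    model  = map(data, ~ lm(mpg ~ wt, data = .x)),  # many models via purrr
    tidied = map(model, tidy)                       # broom: model -> tidy rows
  ) %>%
  select(cyl, tidied) %>%
  unnest(tidied)                                    # one row per group and term

# Plot the estimated slope for each group with +/- one standard error.
p <- ggplot(filter(coefs, term == "wt"), aes(factor(cyl), estimate)) +
  geom_point() +
  geom_errorbar(
    aes(ymin = estimate - std.error, ymax = estimate + std.error),
    width = 0.2
  ) +
  labs(x = "Cylinders", y = "Estimated slope of mpg on wt")
```

Printing p draws the plot. Because coefs is an ordinary data frame, the same dplyr and ggplot2 tools used for raw data apply directly to the model summaries.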