Statisticians and data scientists need to be able to "think with data" in order to answer the statistical questions raised by the flood of data now available. In this talk, I will introduce a set of key idioms, due to Hadley Wickham, that provide a framework for teaching data management skills and that facilitate loading, merging, and transforming large datasets.
This talk will demonstrate these idioms as implemented in new R packages (namely readr, dplyr, haven, lubridate, mosaic, rvest, stringr, and tidyr) to ingest, manage, transform, analyze, and model data. You'll see that these packages are easy to learn and well worth the effort. The talk provides a head start on learning them, then points out the next steps. No prior experience with R is expected.