Looking for an effective way to learn R? This one-day course will teach you a workflow for doing data science with the R language. It focuses on using R's tidyverse, which is a core set of R packages that are known for their impressive performance and ease of use. We will focus on doing data science, not programming. You'll learn to:
* Visualize data with R's ggplot2 package
* Wrangle data with R's dplyr package
* Fit models with base R, and
* Document your work reproducibly with R Markdown
Along the way, you will practice using R's syntax, gaining comfort with R through many exercises and examples. Bring your laptop! The workshop will be taught by Garrett Grolemund, an award-winning instructor and the co-author of _R for Data Science_.
The tidyverse is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures. In the last two years, a suite of tidyverse packages has been created that focuses on modeling. This course walks through the process of modeling data using these tools. The focus is on modeling for prediction and inference, as well as on feature engineering.
In this course, participants will be introduced to principles of data visualization from foundational literature and will implement these principles in hands-on activities using Tableau Public, Python (Altair), and R (ggplot2). The course instructors have experience teaching these concepts as part of undergraduate statistics and data science curricula, and will use example class projects from those courses. The course will be divided into two modules. Module 1 will cover the principles of data visualization theory, summarizing and illustrating foundational data visualization literature. Module 2 will demonstrate how these principles are applied in various software platforms. Hands-on data visualization tasks will be used throughout. Participants must bring their own laptops.
This course will introduce learners to reproducible workflows in R using R Markdown. We will discuss what reproducible research is, why it is important, and what common issues hinder reproducibility. The workshop will guide learners through hands-on exercises in R Markdown and show them how to create reproducible reports and share them on GitHub.
Text data is increasingly important in many domains, and tidy data principles and tidy tools can make text mining easier and more effective. In this short course, learn how to manipulate, summarize, and visualize the characteristics of text using these methods and R packages from the tidy tool ecosystem. These tools work well for many analytical questions and allow analysts to integrate natural language processing into workflows already in wide use. Explore how to implement approaches such as sentiment analysis of texts, measuring tf-idf, and building text models.
New horizons and controversies seem to emerge constantly in the world of statistics and data science. Who can keep up? Our distinguished panel of statistics and data science leaders will discuss this and more in an informal and wide-reaching conversation that contextualizes the SDSS experience with issues of the day.
Considering how to incorporate data science into your high school STEM classroom?
The goal of this workshop is for you to leave with data science skills and applicable examples that can be used in your classroom.
This workshop will answer questions like:
• What is data science?
• How can high schoolers prepare for data science courses in college?
• What does a career in data science involve?
We will walk through how data scientists carry out projects using RStudio, introduce the basics of the R programming language, and work with real datasets to generate visualizations and analyze data.
Note: Advance sign-up is required, so please see the SDSS 2019 Events page for details!