Key:

Computational Statistics

Data Science Technologies

Data Visualization

Education

Machine Learning

Practice and Applications

Software

Wednesday, May 29

Registration
SDSS Hours

Wed, May 29, 7:00 AM - 6:30 PM
Grand Ballroom Foyer

SC1 - Welcome to the Tidyverse: An Introduction to R for Data Science
Short Course

Wed, May 29, 8:00 AM - 5:30 PM
Grand Ballroom E

Instructor(s): Garrett Grolemund, RStudio

Looking for an effective way to learn R? This one-day course will teach you a workflow for doing data science with the R language. It focuses on R's Tidyverse, a core set of R packages known for their impressive performance and ease of use. We will focus on doing data science, not programming. You'll learn to:

* Visualize data with R's ggplot2 package
* Wrangle data with R's dplyr package
* Fit models with base R
* Document your work reproducibly with R Markdown

Along the way, you will practice using R's syntax, gaining comfort with R through many exercises and examples. Bring your laptop! The workshop will be taught by Garrett Grolemund, an award-winning instructor and the co-author of _R for Data Science_.

SC2 - Modeling in the Tidyverse
Short Course

Wed, May 29, 8:00 AM - 5:30 PM
Grand Ballroom F

Instructor(s): Max Kuhn, RStudio

The tidyverse is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures. In the last two years, a suite of tidyverse packages has been created that focuses on modeling. This course walks through the process of modeling data using these tools, with a focus on modeling for prediction and inference as well as feature engineering.

SC3 - Data Visualization: Principles and Applications in R, Tableau, and Python
Short Course

Wed, May 29, 8:00 AM - 12:00 PM
Grand Ballroom G

Instructor(s): Silas Bergen, Winona State University; Todd Iverson, Winona State University

In this course, participants will be introduced to principles of data visualization from foundational literature and implement these principles with hands-on activities using Tableau Public, Python (Altair), and R (ggplot2). The course instructors have experience teaching these concepts and content as part of undergraduate statistics and data science curricula, and will use example class projects from these courses. The course will be divided into two modules. Module 1 will cover the principles of data visualization theory, summarizing and illustrating foundational data visualization literature. Module 2 will demonstrate how these principles are applied in various software platforms. Hands-on data visualization tasks will be employed throughout. Participants must bring their own laptops.

SC4 - Reproducible Research with R
Short Course

Wed, May 29, 8:00 AM - 12:00 PM
Grand Ballroom I

Instructor(s): Kara Woo, Sage Bionetworks

This course will introduce learners to reproducible workflows in R using R Markdown. We will discuss what reproducible research is, why it is important, and what common issues hinder reproducibility. The workshop will guide learners through hands-on exercises in R Markdown and show them how to create reproducible reports and share them on GitHub.

SC5 - Introduction to Deep Learning
Short Course

Wed, May 29, 1:30 PM - 5:30 PM
Grand Ballroom G

Instructor(s): Kevin Kuo, RStudio; Javier Luraschi, RStudio

This course is a practical introduction to neural networks with interactive coding exercises in R. We provide an overview of different types of neural network architectures and how they can be applied in a variety of applications.

SC6 - Text Mining with Tidy Data Principles
Short Course

Wed, May 29, 1:30 PM - 5:30 PM
Grand Ballroom I

Instructor(s): Mara Averick, RStudio; Julia Silge, Stack Overflow

Text data is increasingly important in many domains, and tidy data principles and tidy tools can make text mining easier and more effective. In this short course, learn how to manipulate, summarize, and visualize the characteristics of text using these methods and R packages from the tidy tool ecosystem. These tools are highly effective for many analytical questions and allow analysts to integrate natural language processing into effective workflows already in wide use. Explore how to implement approaches such as sentiment analysis of texts, measuring tf-idf, and building text models.

Exhibits Open
SDSS Hours

Wed, May 29, 5:30 PM - 7:00 PM
Grand Ballroom Foyer

PS01 - Opening Mixer & E-Posters
E-Poster

Wed, May 29, 5:30 PM - 7:00 PM
Grand Ballroom Foyer

Spatial Statistics and Visualization of Public Health Outcomes
Weichuan Dong, Kent State University

Teaching the ASA Guidelines in a Cross-Cultural Setting
Jing Cao, Southern Methodist University

The Daily Question: Building Student Trust and Interest in Undergraduate Introductory Probability and Statistics Courses
Matthew A. Hawks, US Naval Academy

Extending the Grammar of Graphics beyond ggplot2
Silas Bergen, Winona State University

Using Data Science to Support Enrollment Decisions in Higher Education
Monica M King, Drexel University

Data-Driven College Admissions: Useful Metrics or Numeric Nonsense?
Emily Rose Flanagan, University of Washington

Using Data Verbs to Teach the Management of Tabular Data
Chris John Malone, Winona State University

A Shiny Application to Teach the Multiple Linear Regression Analysis in an Undergraduate Course
Carlos M. Lopera-Gómez, Universidad Nacional de Colombia

Predicting Matriculation Rates of Dual Enrollment High School Students
Benjamin Kenneth Brown, Oregon Institute of Technology

A Meta-analysis on the Effect of Information and Communication Technology Tools in Second Language Acquisition
Songtao Wang, University of Victoria

Building Statistical Understanding to Support Organizational Data Culture
Karin Neff, BSD7

SDSS 2019 Hackathon Kickoff
Special Session

Wed, May 29, 6:30 PM - 8:30 PM
Grand Ballroom E

This will be the inaugural year of the Symposium on Data Science and Statistics (SDSS) Hackathon! The goal of the hack is to present a real-world consulting experience that will be mutually beneficial to the industry sponsor and conference participants. Teams will unite participants from diverse academic and industrial backgrounds who have statistical and data science skills, with the goal of presenting implementable solutions.

We worked in conjunction with the eScience Institute at the University of Washington in Seattle to identify a rich data source and prompt that gives back to the greater Seattle community. Thus, the theme for this year's hackathon will be the housing crisis in the Pacific Northwest that has greatly affected Seattle and Portland. This is a topic that has many perspectives and stakeholders: activists, lawyers, and state legislatures. The datasets we have for the hack present a rich diversity of problems that can be approached from a statistical and data science lens. Participants will be working with data from different levels of geography and from a variety of sources, including the American Community Survey, Zillow, Hack Oregon, and other publicly available data pertaining to homelessness and housing insecurity.

This will be a great opportunity for participants to work on a real data problem, learn from professionals in the field, and build relationships with fellow participants, which will enhance the conference experience. We especially encourage students and early-career attendees to participate.

Go to the SDSS Events Page to sign up today!

Online Program


ASA Meetings Department