Online Program

Thursday, May 17
Data Science
Best Practices in Data Science Education
Thu, May 17, 10:30 AM - 12:00 PM
Grand Ballroom G

Start with Data Science as an Introduction to Statistical Thinking (304350)


*Mine Cetinkaya-Rundel, Duke University & RStudio 

Keywords: data science, statistics, intro stats, R, reproducibility, data visualization, data wrangling, git, GitHub, rmarkdown

The introductory statistics course has evolved over the years and taken various forms depending on its target audience. In this talk we discuss a data science course designed to serve as a gateway to the discipline of statistics, to the statistics major, and more broadly to quantitative studies. The course is intended for students with little to no computing or statistical background, and focuses on data wrangling, exploratory data analysis, data visualization, and effective communication. Unlike most traditional introductory statistics courses, this course approaches statistics from a model-based perspective and introduces simulation-based and Bayesian inference later in the course. A heavy emphasis is placed on reproducibility (with R Markdown) and on version control and collaboration (with git/GitHub). We will discuss the course structure, logistics, and pedagogical considerations in detail, and give examples from the case studies used in the course. We will also share student feedback, an assessment of the course's success in recruiting students to the statistical science major, and our experience of growing the course from a small seminar for first-year undergraduates into a larger course open to the entire undergraduate student body.