
Abstract Details

Activity Number: 189 - Contributed Poster Presentations: Section on Statistical Education
Type: Contributed
Date/Time: Monday, July 29, 2019 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistics and Data Science Education
Abstract #307055
Title: Conditional Probability and SQL for Data Science
Author(s): Eric Suess*
Companies: CSU East Bay
Keywords: Conditional Probability; SQL; R; sqlite; counting; Data Science

With the increased emphasis on developing Data Science skills within the traditional Statistics curriculum, incorporating SQL into Probability and Statistics courses is important. We present examples of using SQL commands to compute estimated probabilities and conditional probabilities with large data sets, both real and simulated.

Probabilities are computed by counting occurrences of specified events in the full data set and dividing by the total number of records. Conditional probabilities are computed the same way, but the counting is restricted to the subset of the data set where the conditioning event holds.
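
As an illustration of the counting idea, here is a minimal sketch. It is not taken from the poster: the abstract uses R with sqlite, while this sketch uses Python's standard sqlite3 module, and the table name rolls, the columns d1 and d2, and the two-dice data are all hypothetical. The probability is a count over the full table; the conditional probability is the same count over the rows selected by a WHERE clause.

```python
import sqlite3
from itertools import product

# Hypothetical data: all 36 equally likely outcomes of rolling two dice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rolls (d1 INTEGER, d2 INTEGER)")
conn.executemany(
    "INSERT INTO rolls VALUES (?, ?)",
    product(range(1, 7), repeat=2),
)

# Fraction of rows where the event "sum equals 8" occurs.
EVENT_FRACTION = (
    "1.0 * SUM(CASE WHEN d1 + d2 = 8 THEN 1 ELSE 0 END) / COUNT(*)"
)

# P(sum = 8): count the event over the full data set.
p_sum8 = conn.execute(
    f"SELECT {EVENT_FRACTION} FROM rolls"
).fetchone()[0]

# P(sum = 8 | d1 = 3): count the event over the subset where d1 = 3.
p_sum8_given_d1_3 = conn.execute(
    f"SELECT {EVENT_FRACTION} FROM rolls WHERE d1 = 3"
).fetchone()[0]

print(p_sum8, p_sum8_given_d1_3)
```

With simulated or real data in place of the enumerated outcomes, the same two queries would give estimated rather than exact probabilities.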

Examples will be given for three levels of instruction: introductory Statistics classes, undergraduate Statistics majors, and MS Statistics students. SQL will be run from R using sqlite.

Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program