Abstract:
|
Data science is an emerging interdisciplinary field that combines elements of mathematics, statistics, and computer science for the purpose of extracting meaningful information from data. These data tend to be non-traditional, in the sense that they are often live, large, complex, and/or messy. A first course in statistics at the undergraduate level equips students with a variety of techniques for analyzing small, neat, and clean data sets. However, many of these students will end up working with data that are considerably more complex, and will need facility with statistical computing techniques. More importantly, these students will need a framework for thinking structurally about data. We describe an undergraduate course in a liberal arts environment that provides students with the tools necessary to conduct data science. The course emphasizes modern, practical, and useful skills that cover the full data analysis spectrum, from asking an interesting question to acquiring, managing, manipulating, processing, querying, analyzing, and visualizing data, as well as communicating findings in written, graphical, and oral forms.
|
Copyright © American Statistical Association.