All Times ET
Data visualization plays a crucial role in data science and statistics workflows. It is fundamental to everything from exploratory data analysis to communicating results. Data scientists and statisticians who learn to visualize their data well can better understand it and communicate their work more effectively. Too often, however, visualization is an afterthought.
In this course, attendees will learn the core principles of data visualization: how we perceive visual information, the layered grammar of graphics, and best practices for creating effective visualizations. To put these principles to work, attendees will learn practical R programming skills that improve the quality of their work and teach them to program away the mundane. The course will focus on the popular R package ggplot2 and the reproducible research framework R Markdown. All R instruction will begin with a clear motivation, followed by an explanation of the approach and code, and end with hands-on examples.
This short course is for those who are new to data science and interested in understanding cutting-edge machine learning and deep learning models. It is for those who want to become familiar with the core concepts behind these learning algorithms and their successful applications, and who want to start thinking about how machine learning and deep learning might be useful in their research, business, or career development. The course will provide a comprehensive overview of statistical machine learning and deep learning methods. Topics span classical methods and modern techniques, including basic machine learning tools, supervised and unsupervised learning, deep neural networks, computational algorithms and software for deep learning, and various applications of deep learning.
Data science and statistics become more important in society every year. As a prime example, consider the sudden surge of public interest in COVID-19 tracking projects such as the tracker from 1Point3Acres. From published research that guides policy to the online predictive systems that set prices and control what we read, high-quality and reliable input data is a necessary (but not sufficient!) condition for quality outcomes.
This half-day course will cover the impact of data quality issues on data science and statistics work, taxonomies of data quality issues that can occur, a survey of current techniques and tools for issue identification, and how to start including data quality techniques in one’s data science work process.