All Times EDT

Abstract Details

Activity Number: 48 - Addressing Challenges in Teaching Statistics in the Health Sciences
Type: Contributed
Date/Time: Monday, August 3, 2020 : 10:00 AM to 2:00 PM
Sponsor: Section on Teaching of Statistics in the Health Sciences
Abstract #313622
Title: Setting up the Tools and Workflow for Teaching Reproducible Research, Big Data and Data Mining in Nursing and Public Health
Author(s): Melinda Higgins*
Companies: Emory University
Keywords: education; reproducibility; RStudio; GitHub; rmarkdown; data science
Abstract:

To capitalize on the explosion of health data, nursing and public health scientists need to be able to work with big data computing platforms and data mining. Reproducible workflows are also required by today’s open science calls for transparency. To address these needs, for the past three years we have taught a course on “Big Data Analytics for Healthcare” that teaches these foundational skills. This presentation will provide a checklist and instructions that other instructors can follow to set up similar courses using the open source software tools R and RStudio on the RStudio Cloud platform, with code and data sharing and version control using Git on the GitHub cloud platform. Different workflow approaches will be detailed and compared, with pros and cons discussed. Lessons learned from both instructor and student perspectives will be presented. Exemplars of student projects that apply these skills and workflows to statistical modeling and data mining will be presented, such as microbiome data analysis, web-scraping analysis of social media blogs, text mining of electronic medical records, and applications of classification and regression trees and random forests.


Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program