
Abstract Details

Activity Number: 59 - Evidence-Generation via Big Data in the Real-World Setting
Type: Topic Contributed
Date/Time: Sunday, July 30, 2017 : 4:00 PM to 5:50 PM
Sponsor: Stats. Partnerships Among Academe, Indust. & Govt. Committee
Abstract #323048
Title: Automated Learning Techniques for Electronic Health Record (EHR) Unstructured Notes
Author(s): Michael Sanky and Balaji Ramesh*
Companies: Optum and Optum
Keywords: Electronic Health Record ; Natural Language Processing ; provider notes ; automated learning

It is well known that Electronic Health Records (EHRs) are a rich source of information about patient care, much of which is unavailable anywhere else in the patient record. Often this information is mined in a top-down fashion: notes are segregated by disease cohort, and a bootstrap process is initialized with a priori knowledge about the conditions, treatments, and observations likely to be associated with the disease. In this talk, we describe processes we have developed to perform this task bottom-up, for rare diseases that still lack International Classification of Diseases (ICD) codes and whose descriptors are not standardized. Making no assumptions about the language we will find, we use a variety of statistical natural language processing techniques to discover latent topics and word groupings within sets of EHRs. We describe how we use existing, related ICD codes and other structured data to define test and control cohorts to start the process, and the filters needed to ensure that our learning algorithms see the most relevant patient data.
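
The abstract does not name the specific techniques used, so the following is only a minimal sketch of one way a bottom-up comparison of test and control cohorts might surface candidate descriptors: ranking tokens by a smoothed log-odds ratio of note-level occurrence. The function names (`doc_freq`, `cohort_log_odds`), the whitespace tokenizer, the smoothing value, and the toy note strings are all assumptions made for illustration; they are not the authors' method and not real patient data.

```python
import math
from collections import Counter


def doc_freq(notes):
    """Count, for each token, how many notes contain it at least once."""
    counts = Counter()
    for note in notes:
        # Naive whitespace tokenization (an assumption; real notes need more care).
        counts.update(set(note.lower().split()))
    return counts


def cohort_log_odds(test_notes, control_notes, smoothing=0.5):
    """Rank tokens by smoothed log-odds of appearing in test vs. control notes."""
    test_df = doc_freq(test_notes)
    control_df = doc_freq(control_notes)
    n_test, n_control = len(test_notes), len(control_notes)
    scores = {}
    for token in set(test_df) | set(control_df):
        # 2x2 table of note counts, with additive smoothing to avoid log(0).
        a = test_df[token] + smoothing               # test notes with token
        b = n_test - test_df[token] + smoothing      # test notes without token
        c = control_df[token] + smoothing            # control notes with token
        d = n_control - control_df[token] + smoothing
        scores[token] = math.log((a / b) / (c / d))
    # Highest scores first: tokens most characteristic of the test cohort.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy notes, invented purely to exercise the function.
test_notes = [
    "patient reports muscle weakness and fatigue",
    "progressive muscle weakness noted",
    "fatigue and muscle weakness persist",
]
control_notes = [
    "patient reports seasonal cough",
    "mild cough and fatigue",
    "routine follow up visit",
]

ranked = cohort_log_odds(test_notes, control_notes)
for token, score in ranked[:5]:
    print(f"{token:12s} {score:+.2f}")
```

In this toy setup, tokens that occur in every test note and in no control note rise to the top, while tokens shared evenly across cohorts score near zero. A pipeline along the lines the abstract describes would presumably add the filtering step it mentions before any such ranking.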

Authors who are presenting talks have a * after their name.

Back to the full JSM 2017 program

Copyright © American Statistical Association