Abstract:
|
Automated Learning Techniques for Electronic Health Record (EHR)
It is well known that Electronic Health Records (EHRs) are a rich source of information about patient care, much of which is unavailable anywhere else in the patient record. Often this information is mined in a top-down fashion: notes are segregated by disease cohort, and a bootstrap process is initialized with a priori knowledge about the conditions, treatments, and observations likely to be associated with the disease. In this talk, we describe processes we have developed to perform this task in a bottom-up fashion, for rare diseases that still lack International Classification of Diseases (ICD) codes and whose descriptors are not standardized. Making no assumptions about the language we will find, we use a variety of statistical natural language processing techniques to discover latent topics and word groupings within sets of EHRs. We describe how we use existing, related ICD codes and other structured data to construct the test and control cohorts that start the process, and the filters needed to ensure that our learning algorithms see the most relevant patient data.
|