All Times EDT

Abstract Details

Activity Number: 75 - Contributed Poster Presentations: Biometrics Section
Type: Contributed
Date/Time: Monday, August 3, 2020 : 10:00 AM to 2:00 PM
Sponsor: Biometrics Section
Abstract #312934
Title: Mining for Health: A Comparison of Word Embedding Methods for Electronic Health Records
Author(s): Emily Getzen* and Qi Long
Companies: University of Pennsylvania and University of Pennsylvania
Keywords: EHRs; NLP; Word Embedding

Integrating statistical and machine learning methods into the analysis of electronic health records (EHRs) is making it possible to predict patient diagnoses more accurately. One approach is word embedding, which represents words as real-valued vectors while preserving the semantic and syntactic relationships among them. A wide variety of word embedding tools exists, including Word2Vec, BERT, fastText, USE, and GloVe, but little work has compared their performance on EHRs. We extend these tools to embed a patient’s entire medical history and use the resulting embeddings to build prediction models for medical events. We assess predictive accuracy using the Medical Information Mart for Intensive Care (MIMIC) database.
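
The abstract does not say how word-level embeddings are extended to a whole patient history. As a minimal sketch of one common option, the Python below mean-pools word vectors over all of a patient's notes and fits a small logistic regression on the pooled vectors. Everything in it is an assumption for illustration: the random toy vectors stand in for pretrained embeddings (Word2Vec, GloVe, and so on), the notes and labels are made up, and the helper names (`build_toy_embeddings`, `embed_patient`, `fit_logistic`, `predict_proba`) are hypothetical rather than taken from the authors' work.

```python
import numpy as np


def build_toy_embeddings(vocab, dim=8, seed=0):
    """Hypothetical stand-in for pretrained word vectors: one random vector per word."""
    rng = np.random.default_rng(seed)
    return {word: rng.normal(size=dim) for word in vocab}


def embed_patient(notes, embeddings, dim):
    """Mean-pool the vectors of all known tokens across a patient's notes."""
    tokens = [tok for note in notes for tok in note.lower().split()]
    vectors = [embeddings[tok] for tok in tokens if tok in embeddings]
    if not vectors:
        # No known tokens: fall back to a zero vector of the right size.
        return np.zeros(dim)
    return np.mean(vectors, axis=0)


def predict_proba(X, w, b):
    """Logistic model: probability of the positive class for each row of X."""
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))


def fit_logistic(X, y, lr=0.1, steps=500):
    """Fit logistic regression by plain batch gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        residual = predict_proba(X, w, b) - y
        w -= lr * (X.T @ residual) / len(y)
        b -= lr * residual.mean()
    return w, b


# Made-up patient histories and binary outcome labels (illustration only).
patients = {
    "p1": ["chest pain and shortness of breath", "elevated troponin"],
    "p2": ["routine follow up", "no acute complaints"],
    "p3": ["severe chest pain", "elevated troponin and abnormal ecg"],
    "p4": ["annual physical", "routine labs normal"],
}
labels = np.array([1.0, 0.0, 1.0, 0.0])

DIM = 8
vocab = sorted({tok for notes in patients.values() for n in notes for tok in n.lower().split()})
embeddings = build_toy_embeddings(vocab, dim=DIM)

# One pooled vector per patient, stacked into a feature matrix.
X = np.vstack([embed_patient(notes, embeddings, DIM) for notes in patients.values()])
w, b = fit_logistic(X, labels)
probs = predict_proba(X, w, b)
```

In a real comparison, the toy lookup table would be replaced by each embedding tool in turn and the downstream model would be scored on held-out patients, so that differences in predictive accuracy can be attributed to the embedding method.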

Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program