Abstract Details

Activity Number: 256 - Contributed Poster Presentations: Section on Statistical Learning and Data Science
Type: Contributed
Date/Time: Monday, July 29, 2019 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #305372
Title: A Natural Language Processing Algorithm for Medication Extraction from Electronic Health Records Using the R Programming Language: MedExtractR
Author(s): Hannah L Weeks* and Cole Beck and Elizabeth McNeer and Joshua C Denny and Cosmin A Bejan and Leena Choi
Companies: Vanderbilt University and Vanderbilt University Medical Center and Vanderbilt University and Vanderbilt University and Vanderbilt University and Vanderbilt University Medical Center
Keywords: electronic health records; natural language processing; medication extraction; R
Abstract:

Electronic health records (EHRs) are a rich source of data for clinical research, provided the information can be extracted accurately and efficiently. Medication information can be critical for research on drug exposure-response relationships, which informs strategies to improve patient treatment. We describe medExtractR, a natural language processing (NLP) system developed in R that extracts medication information such as strength, dose amount, and frequency from clinical notes. We evaluated medExtractR on two drugs, tacrolimus and lamotrigine, whose prescribing patterns range from simple to highly complex, and compared it with three existing NLP systems: MedEx, MedXN, and CLAMP. MedExtractR achieved high recall/precision/F1 for each drug on 60 training notes (tacrolimus: 1.00/.991/.996, lamotrigine: .975/.996/.986) and 50 test notes (tacrolimus: .992/.972/.982, lamotrigine: .967/.983/.975). On the test set, its F1 exceeded that of all three existing NLP systems for tacrolimus/lamotrigine (MedEx: .766/.888; MedXN: .943/.865; CLAMP: .743/.803). Our results suggest that medExtractR is a more accurate method for medication dose extraction on these drugs, which should ultimately lead to higher quality EHR-based research datasets.
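
For readers unfamiliar with the metrics, the reported recall/precision/F1 triples can be sanity-checked with a short sketch. It assumes F1 is the usual harmonic mean of precision and recall, which the abstract does not state explicitly. The sketch is written in Python purely for illustration, and the names `f1_score` and `reported` are hypothetical rather than part of medExtractR. Because the published recall and precision values are already rounded, a recomputed F1 can differ from the reported one in the third decimal place.

```python
def f1_score(recall: float, precision: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of F1)."""
    if recall + precision == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (recall, precision, reported F1) for medExtractR, copied from the abstract.
reported = {
    ("training", "tacrolimus"): (1.00, 0.991, 0.996),
    ("training", "lamotrigine"): (0.975, 0.996, 0.986),
    ("test", "tacrolimus"): (0.992, 0.972, 0.982),
    ("test", "lamotrigine"): (0.967, 0.983, 0.975),
}

# Recompute F1 from the rounded recall and precision values.
recomputed = {
    key: round(f1_score(recall, precision), 3)
    for key, (recall, precision, _) in reported.items()
}

for key, (_, _, reported_f1) in reported.items():
    dataset, drug = key
    print(f"{dataset:8s} {drug:12s} reported={reported_f1:.3f} recomputed={recomputed[key]:.3f}")
```

Each recomputed value should agree with the reported F1 to within about .001, which is consistent with rounding of the inputs.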


Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program