St. James Ballroom
Extracting Information from Clinical Notes Using Natural Language Processing (303905)
Ning Smith, Kaiser Permanente Center for Health Research; *Qing Zhou, Kaiser Permanente Center for Health Research
Keywords: natural language processing, electronic health records, unstructured data, Canary
Current healthcare research relies primarily on structured data, i.e., the coded parts of the electronic health record (EHR), for case detection, which may lead to missed cases or biased findings. An enormous amount of information exists in unstructured free text such as clinical notes. Classical natural language processing (NLP) technology encounters significant challenges in processing clinical notes. In addition to the noise inherent in EHR data, most NLP algorithms require large training data sets with ground truth, which is very costly to obtain. Another major barrier is the need for medical researchers to acquire the programming skills to develop customized NLP algorithms or to use existing NLP methods. We seek to demonstrate a solution to these challenges by using Canary, a free and open-source software tool, to identify patients who developed local reactions after vaccination in a small study population (69 patients). Our results show that NLP can be successfully employed to extract outcome data from clinical notes with high accuracy while adequately handling negations, medical context and terminology, and misspellings.