All Times EDT

Abstract Details

Activity Number: 393 - NLP and Text Analysis
Type: Contributed
Date/Time: Wednesday, August 10, 2022 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistical Learning and Data Science
Abstract #322636
Title: Application of Medical Concept Embeddings and Evaluation to ICD-10 Diagnosis Codes
Author(s): Meghan Beckowski* and Nader Karamzadeh
Companies: Deloitte Consulting, LLC and Deloitte Consulting, LLC
Keywords: healthcare; NLP; dimension reduction; embeddings; diagnosis codes
Abstract:

Medical Concept Embeddings (MCEs) have emerged in the literature as a feature reduction technique that compresses the sparse space of healthcare diagnosis codes into a smaller set of features. Originating in the natural language processing (NLP) domain, this technique relies on a neural network to learn associations in a large corpus, in this case the co-occurrence of diagnosis codes in healthcare claims data. In this work, we introduce a novel application of MCEs using ICD-10 diagnosis codes from a large dataset of healthcare claims. We illustrate our framework and methodology, which includes testing word2vec and doc2vec. We then demonstrate the intrinsic value of the resulting embeddings along multiple evaluation criteria, including t-SNE plots, Rand indices, and Normalized Mutual Information (NMI), and discuss the implications and use for deep learning models.
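
As an illustration of the two clustering-agreement criteria named above, the following is a minimal sketch of the unadjusted Rand index and NMI in plain Python. It is not the authors' evaluation code. The reference grouping (the ICD-10 chapter letter of each code), the example codes, and the `clusters` assignments are all assumptions made for this example; the abstract does not state which reference partition or clustering method was used.

```python
import math
from collections import Counter
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Unadjusted Rand index: fraction of item pairs on which two partitions agree.

    Requires at least two items.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def normalized_mutual_information(labels_a, labels_b):
    """NMI with arithmetic-mean normalization: 2 * I(A;B) / (H(A) + H(B))."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))

    def entropy(counts):
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    mutual_info = sum(
        (c / n) * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in count_ab.items()
    )
    denom = entropy(count_a) + entropy(count_b)
    # Two single-cluster partitions carry no information; treat them as identical.
    return 1.0 if denom == 0 else 2 * mutual_info / denom


# Hypothetical example: six ICD-10 codes, grouped by chapter letter (assumed
# reference partition) and by an assumed clustering of their embeddings.
codes = ["E11.9", "E10.9", "I10", "I25.10", "J45.909", "J44.9"]
chapters = [code[0] for code in codes]   # ['E', 'E', 'I', 'I', 'J', 'J']
clusters = [0, 0, 1, 1, 2, 1]            # made-up cluster assignments

print("Rand index:", rand_index(chapters, clusters))
print("NMI:", normalized_mutual_information(chapters, clusters))
```

Both scores equal 1 when the clustering reproduces the reference grouping exactly. In practice, library implementations such as those in scikit-learn would likely be used instead; NMI has several normalization variants, and the arithmetic-mean form here is only one of them.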


Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program