West Coast Ballroom
Machine Learning for Medical Coding in Health Care Surveys (307830)
Peter Baumgartner, RTI International
Christine Carr, RTI International
Rob Chew, RTI International
*Emily Hadley, RTI International
Jason Nance, RTI International
David Plotner, RTI International
Aerian Tatum, HealthCare Resolution Services
Rita Thissen, RTI International
Keywords: text analysis, natural language processing, machine learning, medical coding
Manually coding free-form text responses in surveys can be a time-intensive and expensive process. For health care surveys, the process of medical coding is particularly complex due to the need for medical domain knowledge, the varying quality of clinical notations, and the large number of classification codes. Given the challenges posed to medical coders and the constraints placed on statistical agencies to develop high-quality estimates within budget, machine learning techniques offer potential gains in both efficiency and quality.
In this talk, we explore a machine learning approach for assigning medical codes to clinical verbatim text found in medical records for patient visits from the 2016 and 2017 National Ambulatory Medical Care Survey (NAMCS) and the National Hospital Ambulatory Medical Care Survey – Emergency Department (NHAMCS-ED). We describe the process of creating the machine learning models, evaluate the performance of a benchmark model, and discuss potential use cases. While the current work suggests that models still underperform trained medical coders on this difficult task, creative human-augmented solutions may benefit the manual coding process.