Tuesday, January 7

Tue, Jan 7, 9:00 AM - 10:45 AM
West Coast Ballroom

Statistical Learning Methods for Health Care Innovation

Machine Learning for Medical Coding in Health Care Surveys (307830)

Peter Baumgartner, RTI International
Christine Carr, RTI International
Rob Chew, RTI International
*Emily Hadley, RTI International
Jason Nance, RTI International
David Plotner, RTI International
Aerian Tatum, HealthCare Resolution Services
Rita Thissen, RTI International

Keywords: text analysis, natural language processing, machine learning, medical coding

Manually coding free-form text responses in surveys can be a time-intensive and expensive process. For health care surveys, the process of medical coding is particularly complex due to the need for medical domain knowledge, the varying quality of clinical notations, and the large number of classification codes. Given the challenges posed to medical coders and the constraints placed on statistical agencies to develop high-quality estimates within budget, machine learning techniques offer potential gains in both efficiency and quality.

In this talk, we explore a machine learning approach for assigning medical codes to clinical verbatim text found in medical records for patient visits from the 2016 and 2017 National Ambulatory Medical Care Survey (NAMCS) and the National Hospital Ambulatory Medical Care Survey – Emergency Department (NHAMCS-ED). We discuss the process of creating machine learning models, the performance of a benchmark model, and potential use cases. While the current work suggests that models still underperform trained medical coders on this difficult task, creative human-augmented solutions may benefit the manual coding process.

American Statistical Association