Friday, February 24
CS11 Text Analytics Fri, Feb 24, 2:00 PM - 3:30 PM
City Terrace 7

Using Text Analytics and Signal Detection to Predict Medical Device Recalls (303385)

*Lisa Ensign, Significant Statistics 

Keywords: Machine learning, text mining, signal detection, disproportionality analysis, unstructured data, medical device, MAUDE, spontaneous reporting systems, class-imbalanced data

In the last decade there has been an alarming rise both in the reporting of medical device problems and in recalls of medical devices that were found to have a significant risk of causing serious injury or death in patients. The US Food and Drug Administration’s (FDA) Manufacturer and User Facility Device Experience (MAUDE) database is a spontaneous reporting system (SRS) providing over 4 million medical device adverse event and product problem reports dating back to 1991. However, most MAUDE data is unstructured, and reports of interest are typically received intermittently over periods spanning years, or even decades, of a product’s history. Text analytics makes it possible to use this complex data, both to classify records, such as those associated with a device recall, and to draw on narrative information describing product problems. This presentation highlights some of the challenges and opportunities of cleaning and readying text data for analysis; discusses the training and testing of machine learning algorithms, including consideration of class-imbalanced data; and uses signal detection data mining methods to systematically assess temporal trends in the resulting text mining predictions. While the examples focus on medical devices, the approach is applicable to product reports found in SRS, warranty, consumer complaint and social media databases.
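
The abstract does not say which disproportionality statistic the signal detection step uses. As an illustration only, the sketch below computes the proportional reporting ratio (PRR), one commonly used disproportionality measure, with an approximate 95% confidence interval, and applies it period by period to look for a temporal trend. The function names, the flagging rule (at least 3 reports and a lower confidence bound above 1), and all counts are hypothetical and are not taken from the presentation or from MAUDE.

```python
import math


def prr(a, b, c, d):
    """Proportional reporting ratio with an approximate 95% CI.

    a: reports for the device of interest that were classified as a problem
    b: reports for the device of interest not classified as a problem
    c: reports for all other devices classified as a problem
    d: reports for all other devices not classified as a problem
    """
    if a <= 0 or c <= 0:
        raise ValueError("a and c must be positive to compute a PRR")
    ratio = (a / (a + b)) / (c / (c + d))
    # Standard error of ln(PRR) from the usual large-sample approximation.
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(ratio) - 1.96 * se)
    upper = math.exp(math.log(ratio) + 1.96 * se)
    return ratio, lower, upper


def flag_signals(counts_by_period, min_reports=3):
    """Return the periods whose PRR lower bound exceeds 1.

    The threshold here is an illustrative convention, not the rule
    used in the presentation.
    """
    flagged = []
    for period, (a, b, c, d) in counts_by_period.items():
        if a < min_reports:
            continue  # too few reports for a stable estimate
        _, lower, _ = prr(a, b, c, d)
        if lower > 1:
            flagged.append(period)
    return flagged


# Hypothetical yearly counts (a, b, c, d) of text-mining predictions.
example_counts = {
    "2015": (2, 198, 100, 9700),
    "2016": (3, 197, 100, 9700),
    "2017": (20, 180, 100, 9700),
}

if __name__ == "__main__":
    for period, counts in example_counts.items():
        ratio, lower, upper = prr(*counts)
        print(f"{period}: PRR={ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
    print("Flagged periods:", flag_signals(example_counts))
```

In this sketch the counts in each period would come from the text classifier's predictions, so a rising PRR over successive periods is what a temporal signal would look like under these assumptions.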