All Times EDT

Abstract Details

Activity Number: 389 - Novel Data Collection Strategies in Business and Economics
Type: Contributed
Date/Time: Wednesday, August 10, 2022 : 8:30 AM to 10:20 AM
Sponsor: Business and Economic Statistics Section
Abstract #323498
Title: Extracting Product Innovation Information from Unstructured Data
Author(s): Neil Kattampallil* and Nathaniel Ratcliff and Aritra Halder and Gary Anderson and John Jankowski and Gizem Korkmaz
Companies: University of Virginia - Biocomplexity Institute and Initiative and University of Virginia - Biocomplexity Institute and Initiative and University of Virginia and National Center for Science & Engineering Statistics, National Science Foundation and National Center for Science & Engineering Statistics, National Science Foundation and Coleridge Initiative
Keywords: Natural Language Processing; Machine Learning; BERT; text-based data; News articles; innovation
Abstract:

Imagine trying to identify innovation in the market from the billions of gigabytes of data created globally every day. A significant portion of this data consists of unstructured text from blog posts, news articles, tweets and other social media, and medical and administrative reports. Automating the extraction of useful information from unstructured data has been a challenge because text often contains nuances that humans readily understand but that are difficult for machines to process. To explore the feasibility of detecting innovation in new products from text, we are using powerful language models and improved Natural Language Processing (NLP) methods. This presentation describes our approach and presents results from a Question-Answering system that extracts company and product names from unstructured articles.
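
As an illustration of the question-answering framing mentioned in the abstract, the following is a minimal sketch, not the authors' system. It assumes that extraction is posed as one question per field and that low-confidence answers are discarded. The question wording, the `min_score` threshold, the function names, and the sample article are all hypothetical. A stub stands in for the language model so the sketch runs offline.

```python
from typing import Any, Callable, Dict

# Hypothetical question templates; the abstract does not give the wording used.
QUESTIONS = {
    "company": "Which company launched the product?",
    "product": "What is the name of the new product?",
}

# A QA function maps (question, context) to {"answer": str, "score": float}.
QAFunction = Callable[[str, str], Dict[str, Any]]


def extract_entities(article: str, qa: QAFunction, min_score: float = 0.3) -> Dict[str, str]:
    """Ask every question against the article and keep confident answers only."""
    found: Dict[str, str] = {}
    for field, question in QUESTIONS.items():
        result = qa(question, article)
        if result["score"] >= min_score:  # assumed threshold, chosen for illustration
            found[field] = result["answer"]
    return found


def make_stub_qa(canned: Dict[str, str]) -> QAFunction:
    """Build an offline stand-in for a QA model (an assumption for this sketch).

    The stub returns the canned answer for a question only when that answer
    literally occurs in the context, mimicking extractive (span-based) QA.
    """
    def qa(question: str, context: str) -> Dict[str, Any]:
        answer = canned.get(question, "")
        if answer and answer in context:
            return {"answer": answer, "score": 1.0}
        return {"answer": "", "score": 0.0}
    return qa


# Made-up article text, used only to exercise the sketch.
article = "Acme Corp today announced the RoboVac 3000, a new robotic vacuum."
stub = make_stub_qa({
    QUESTIONS["company"]: "Acme Corp",
    QUESTIONS["product"]: "RoboVac 3000",
})
print(extract_entities(article, stub))
```

To try a real extractive model instead of the stub, one option is the Hugging Face `transformers` question-answering pipeline, which needs the package installed and a model download: `hf = pipeline("question-answering")`, then pass `lambda q, c: hf(question=q, context=c)` as `qa`. The pipeline's default model is not necessarily the one used in the presented work.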


Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program