Activity Number:
|
389
- Novel Data Collection Strategies in Business and Economics
|
Type:
|
Contributed
|
Date/Time:
|
Wednesday, August 10, 2022 : 8:30 AM to 10:20 AM
|
Sponsor:
|
Business and Economic Statistics Section
|
Abstract #323498
|
|
Title:
|
Extracting Product Innovation Information from Unstructured Data
|
Author(s):
|
Neil Kattampallil* and Nathaniel Ratcliff and Aritra Halder and Gary Anderson and John Jankowski and Gizem Korkmaz
|
Companies:
|
University of Virginia - Biocomplexity Institute and Initiative and University of Virginia - Biocomplexity Institute and Initiative and University of Virginia and National Center for Science & Engineering Statistics, National Science Foundation and National Center for Science & Engineering Statistics, National Science Foundation and Coleridge Initiative
|
Keywords:
|
Natural Language Processing;
Machine Learning;
BERT;
text-based data;
News articles;
innovation
|
Abstract:
|
Imagine trying to identify innovation in the market from the billions of gigabytes of data created globally every day. A significant portion of this data consists of unstructured text from blog posts, news articles, tweets and other social media, and medical and administrative reports. Automating the extraction of useful information from unstructured data has been a challenge because such text often contains nuances that humans readily understand but that are difficult for machines to process. To explore the feasibility of detecting innovation in new products from text, we are using powerful language models and improved Natural Language Processing (NLP) methods. This presentation describes and presents our results from using a Question-Answering system to extract company and product names from unstructured articles.
|
Authors who are presenting talks have a * after their name.