
All Times EDT

Abstract Details

Activity Number: 205 - Applications of Machine Learning
Type: Contributed
Date/Time: Tuesday, August 4, 2020 : 10:00 AM to 2:00 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #312785
Title: Summarizing and Extracting Insights from Consumer Review Data
Author(s): Jingting Hui* and Jason Parcon
Companies: PepsiCo and PepsiCo
Keywords: Natural Language Processing; Sentiment Analysis; LDA Topic Modeling; AllenNLP; Cluster Analysis; Open-Ended

Obtaining input from target consumers is an important part of the product development process in the food and beverage industry. This input is either quantitative or qualitative, and the latter can take the form of open-ended consumer reviews. Accurate insights from open-ended reviews matter because quantitative data provides insights only on pre-determined product attributes. This paper discusses how techniques such as sentiment analysis, topic modeling, and cluster analysis can be used to obtain insights from open-ended reviews. The goal is not only to identify the probability that a consumer likes or dislikes a product, but also to identify the factors that determine consumers' opinions of the product. For this presentation, verbatims were collected from social media and e-commerce sites as examples. They were first analyzed using unsupervised learning with the pre-trained model from AllenNLP to identify sentiment probabilities. Two approaches to extracting latent topics were then compared: LDA topic modeling and traditional cluster analysis. Finally, a pre-trained BERT model was used to identify similar consumer reviews, complementing the results obtained from LDA topic modeling.
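
The final step, identifying similar consumer reviews, can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract uses embeddings from a pre-trained BERT model, whereas this example substitutes simple word-count vectors so that it runs with the Python standard library alone. The review texts and the helper names (bag_of_words, cosine, most_similar) are hypothetical and invented for illustration only.

```python
import math
import re
from collections import Counter

# Hypothetical verbatims, invented purely for illustration.
reviews = [
    "Love the crunchy texture and the salty flavor",
    "Too salty for me and the bag was half empty",
    "Great crunchy texture, love the flavor",
    "The bottle leaked during shipping",
]


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(index, vectors):
    """Return the index of the review closest to vectors[index]."""
    scores = [
        (cosine(vectors[index], v), j)
        for j, v in enumerate(vectors)
        if j != index
    ]
    return max(scores)[1]


# Assumption: word counts stand in for the BERT embeddings named in the abstract.
vectors = [bag_of_words(r) for r in reviews]
nearest = most_similar(0, vectors)
print(reviews[nearest])
```

In this sketch the first review is matched to the third, since both mention the crunchy texture and the flavor. With contextual embeddings in place of word counts, the same cosine comparison could in principle also match reviews that express similar opinions in different words.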

Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program