Activity Number: 393 - NLP and Text Analysis
Type: Contributed
Date/Time: Wednesday, August 10, 2022 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistical Learning and Data Science
Abstract #322115
Title: A Text Mining Approach to Determine Correlations Between the Spanish Flu and COVID-19
Author(s): Billie Anderson* and Majid Bani-Yaghoub and Vagmi Kantheti and Scott Curtis
Companies: University of Missouri Kansas City and University of Missouri Kansas City and University of Missouri Kansas City and University of Missouri Kansas City
Keywords: COVID-19; Spanish flu; fulltext R package; full-text journal articles
Abstract:
In the past two decades, databases and tools for accessing them with less data engineering effort have become available, making it possible to study historical and modern-day topics together. Throughout the COVID-19 pandemic, many researchers and scientists have reflected on whether any lessons learned from the Spanish flu pandemic of 1918 could be helpful in the present pandemic. This paper addresses this question using a text mining approach. Research studies that use text mining rarely use full-text journal articles; abstracts are mainly used because they are easier to access. This paper presents the methodology used to develop a full-text journal article corpus using the R package fulltext. The fulltext package allows searching across multiple sources and fetching full-text articles using search terms. The search terms used to develop the corpus were “Spanish flu” and its synonyms. A correlated topic model was applied to the resulting corpus of 2,243 full-text journal articles to determine whether there were any articles in which Spanish flu and COVID-19 are correlated.
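As a rough illustration of what "correlated" means in a correlated topic model, the sketch below draws per-document topic proportions from a logistic normal distribution and reads a topic-to-topic correlation off the covariance matrix. It is not the authors' code (the paper works in R), and the three topics, the mean vector `mu`, and the covariance `cov` are hypothetical values chosen only for illustration. For simplicity it uses one logit per topic instead of fixing one logit to zero.

```python
import math
import random


def softmax(eta):
    """Map real-valued logits to proportions that sum to 1."""
    m = max(eta)
    exps = [math.exp(x - m) for x in eta]
    total = sum(exps)
    return [e / total for e in exps]


def cholesky(cov):
    """Lower-triangular factor L with L * L^T = cov (cov must be positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def sample_topic_proportions(mu, cov, rng):
    """Draw eta ~ Normal(mu, cov), then return softmax(eta)."""
    L = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    eta = [mu[i] + sum(L[i][k] * z[k] for k in range(i + 1))
           for i in range(len(mu))]
    return softmax(eta)


def topic_correlation(cov, i, j):
    """Correlation between the logits of topics i and j."""
    return cov[i][j] / math.sqrt(cov[i][i] * cov[j][j])


# Hypothetical setup: topic 0 ~ "Spanish flu", topic 1 ~ "COVID-19",
# topic 2 ~ an unrelated topic. These values are assumptions.
mu = [0.0, 0.0, 0.0]
cov = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

rng = random.Random(0)
theta = sample_topic_proportions(mu, cov, rng)
flu_covid_corr = topic_correlation(cov, 0, 1)
```

In a fitted model the covariance would be estimated from the corpus. A large positive off-diagonal entry between two topics indicates that documents with a high share of one topic tend to have a high share of the other.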
Authors who are presenting talks have a * after their name.