All Times EDT

Abstract Details

Activity Number: 445 - GOVT CSpeed 2
Type: Contributed
Date/Time: Thursday, August 12, 2021 : 4:00 PM to 5:50 PM
Sponsor: Government Statistics Section
Abstract #318498
Title: Mining Federal RePORTER Using Machine Learning: Selected Case Studies on the Popularity of Concept-Related Topics
Author(s): Kathryn Linehan* and Eric Oh and Joel Thurston and Stephanie S. Shipp and Sallie Keller and John Jankowski and Audrey Kindlon
Companies: University of Virginia, Biocomplexity Institute and University of Virginia, Biocomplexity Institute and University of Virginia, Biocomplexity Institute and University of Virginia, Biocomplexity Institute and University of Virginia, Biocomplexity Institute and National Center for Science and Engineering Statistics and National Center for Science and Engineering Statistics
Keywords: machine learning; topic modeling; information retrieval; statistical applications; administrative data; data insights
Abstract:

Extracting insights from large volumes of data has become essential. In this research, we present insights from Federal RePORTER, a database of federally funded research and development (R&D) grants. By mining project abstracts, we build a concept-themed corpus (e.g., abstracts related to pandemics) using term matching and latent semantic indexing (LSI), and then apply non-negative matrix factorization (NMF) topic modeling to discover concept-related topics in that corpus. We analyze these topics over time to identify when they rise and decline in popularity, and we include a sensitivity analysis of these trends. We show that the trends found by our topic analysis correspond to past events that would affect the rise or fall in popularity of particular topics. Stability results for the topic model are also included. We will present selected case studies on concepts such as pandemics, coronavirus, and artificial intelligence.
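
The NMF and topic-trend steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the six toy "abstracts" and their years are invented for illustration and do not come from Federal RePORTER, the corpus-building step (term matching and LSI) is skipped, and NMF is fit with plain multiplicative updates on raw term counts. All function names (`nmf`, `topic_share_by_year`, `top_terms`) are hypothetical helpers, and the mean topic share per year is only one assumed way to measure popularity.

```python
import random

def matmul(A, B):
    """Multiply two dense matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(V, W, H):
    """Frobenius norm of V - WH (the NMF reconstruction error)."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx)) ** 0.5

def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Factor V (docs x terms) into non-negative W (docs x k) and H (k x terms)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # Multiplicative update: H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # Multiplicative update: W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H

def topic_share_by_year(years, W):
    """Average each document's normalized topic weights within its year."""
    totals, counts = {}, {}
    for year, row in zip(years, W):
        s = sum(row) or 1.0
        acc = totals.setdefault(year, [0.0] * len(row))
        for j, w in enumerate(row):
            acc[j] += w / s
        counts[year] = counts.get(year, 0) + 1
    return {y: [x / counts[y] for x in totals[y]] for y in sorted(totals)}

def top_terms(H, vocab, n=3):
    """Return the n highest-weighted terms for each topic row of H."""
    return [[vocab[j] for j in sorted(range(len(row)), key=lambda j: -row[j])[:n]]
            for row in H]

# Invented toy corpus: (fiscal year, abstract text). Not real grant data.
docs = [
    (2008, "influenza vaccine trial influenza"),
    (2008, "neural network image classification"),
    (2015, "vaccine influenza outbreak surveillance"),
    (2015, "neural network speech recognition network"),
    (2020, "coronavirus outbreak vaccine surveillance"),
    (2020, "coronavirus vaccine trial outbreak"),
]
years = [year for year, _ in docs]
vocab = sorted({tok for _, text in docs for tok in text.split()})
# Document-term count matrix: one row per abstract, one column per term.
V = [[text.split().count(term) for term in vocab] for _, text in docs]

W, H = nmf(V, k=2)
trend = topic_share_by_year(years, W)
for topic_id, terms in enumerate(top_terms(H, vocab)):
    print("topic", topic_id, terms)
for year, shares in trend.items():
    print(year, [round(s, 2) for s in shares])
```

In this sketch, a topic whose yearly share climbs across the printed rows would be read as gaining popularity. Rerunning with different `seed` or `k` values is a crude stand-in for the stability and sensitivity checks the abstract mentions, not a reproduction of them.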


Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program