Abstract:
|
Statistics Without Borders (SWB) is partnering with the Montgomery County, MD Community Emergency Response Team (MCCERT) to address the current COVID-19 pandemic in the United States. Using a methodological framework developed by Peterson et al. (2019) and recently applied in the National Capital Region with George Mason University's streaming analytics system, Citizen Helper, a team of eight SWB volunteers launched a similar effort on the West Coast in Palo Alto, CA, and the surrounding area. Through a variety of web scraping and data filtering methods, the team gathered targeted Twitter data based on predefined keywords related to geographic location, prevention, symptoms, and risks of COVID-19. The team then developed several predictive machine learning models to classify out-of-sample tweets by relevance (high, medium, low, or irrelevant) to the needs of MCCERT. The team also experimented with semi-supervised approaches to explore, improve, and extend the classification scheme.
|