634 – Applications of Text Analysis in the U.S. Government
Tracking Disease Outbreaks Using Twitter
David Marchette
Naval Surface Warfare Center
Elizabeth Hohman
Naval Surface Warfare Center
Social media applications such as Twitter can be used to detect and track disease outbreaks in near real time, provided that 1) the data can be collected, 2) the person providing the information can be geographically located, 3) it can be determined whether that person (or someone geographically close to them) is sick, and 4) the "sick" individuals can be identified as suffering from the same disease or at least as having similar symptoms. We describe our efforts in detection and tracking using Twitter data collected from January 2013 to the present, and the issues that arise in using such data. In particular, we cover keyword- and topic-based methods, as well as methods for classifying a tweet or a user as "sick". We report some of our successes and failures and offer insight into the utility and limitations of micro-blog data such as that provided by Twitter. Finally, we consider variations on the basic surveillance theme: watching for a known disease (measles), watching for a known set of symptoms (fever, stomach ache), and the more general (and more difficult) problem of detecting an unusual number of sick individuals within a constrained geographic region (county).
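To make the keyword-based approach and the county-level detection problem concrete, the following is a minimal Python sketch. It is not the authors' method: the keyword list, the function names (`looks_sick`, `county_counts`, `flag_counties`), the assumption that each tweet already carries a county label, and the simple Poisson-style z-score rule are all illustrative assumptions.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical symptom keywords; a real list would need careful curation.
SYMPTOM_KEYWORDS = {"fever", "flu", "measles", "stomach ache", "vomiting"}

# One case-insensitive pattern that matches any keyword as a whole word or phrase.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in sorted(SYMPTOM_KEYWORDS)) + r")\b",
    re.IGNORECASE,
)


def looks_sick(text):
    """Return True if the tweet text mentions any symptom keyword."""
    return _PATTERN.search(text) is not None


def county_counts(tweets):
    """Count keyword-matching tweets per county.

    `tweets` is an iterable of (county, text) pairs. Tweets whose county
    is None (could not be geolocated) are skipped.
    """
    counts = Counter()
    for county, text in tweets:
        if county is not None and looks_sick(text):
            counts[county] += 1
    return counts


def flag_counties(today, baseline, z_threshold=3.0):
    """Flag counties whose count today is unusually high.

    Assumes counts are roughly Poisson, so the standard deviation is the
    square root of the expected (baseline) count. The baseline is floored
    at 1.0 to avoid dividing by zero for counties with no history.
    """
    flagged = []
    for county, count in today.items():
        expected = max(baseline.get(county, 0.0), 1.0)
        z = (count - expected) / sqrt(expected)
        if z >= z_threshold:
            flagged.append(county)
    return sorted(flagged)
```

Note that plain keyword matching also flags figurative uses such as "Bieber fever", which is one reason a separate step for classifying a tweet or a user as "sick" is needed.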