Proportional reporting ratios (PRRs) have long been used as a standard statistical measure of whether an adverse event is reported disproportionately often among patients exposed to a given medication. This simple statistic is foundational to the science of pharmacovigilance, and its use is well understood by safety scientists, who would also like to apply it to social media data to enhance pharmacovigilance efforts.
However, given the vast amount of social media data available, manual evaluation and curation of these data are resource intensive and must be supplemented by automated methods such as text classification and machine learning. Because most classification methods carry some underlying error, applying PRR-like analysis to automatically classified data introduces an inherent bias. While early results have shown similar outcomes of events when PRR is applied to social media data, the measure should be used with some caution.
Here we show how aggregate statistics combined with automated classification may lead to inaccurate reporting of PRRs when used with social media data.
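
As a rough illustration of this concern, the following Python sketch computes a PRR from a standard 2x2 contingency table and then recomputes it after passing the same counts through an imperfect event classifier. The counts, the sensitivity and specificity values, and the helper names `prr` and `observed_counts` are all hypothetical, chosen only for illustration. The sketch also assumes that classifier error is the same for every drug, which need not hold in practice.

```python
def prr(a, b, c, d):
    """Proportional reporting ratio from a 2x2 table.

    a: posts on the drug of interest that mention the event
    b: posts on the drug of interest that do not
    c: posts on all other drugs that mention the event
    d: posts on all other drugs that do not
    """
    return (a / (a + b)) / (c / (c + d))


def observed_counts(with_event, without_event, sensitivity, specificity):
    """Expected counts after an imperfect classifier labels the posts.

    Assumes (hypothetically) that the error rates do not depend on the drug.
    """
    flagged = sensitivity * with_event + (1 - specificity) * without_event
    unflagged = (1 - sensitivity) * with_event + specificity * without_event
    return flagged, unflagged


# Hypothetical "true" counts, not taken from any real data set.
a, b, c, d = 40, 960, 200, 18800
true_prr = prr(a, b, c, d)

# Hypothetical classifier performance.
sensitivity, specificity = 0.80, 0.95
a_obs, b_obs = observed_counts(a, b, sensitivity, specificity)
c_obs, d_obs = observed_counts(c, d, sensitivity, specificity)
observed_prr = prr(a_obs, b_obs, c_obs, d_obs)

print(f"true PRR:     {true_prr:.2f}")
print(f"observed PRR: {observed_prr:.2f}")
```

With these invented numbers the true PRR is roughly 3.8, while the PRR computed from the classified counts is about 1.38. In this example, false positives among the many posts that do not mention the event outnumber the true positives, which pulls the ratio toward 1.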