JSM 2014 Home

Abstract Details

Activity Number: 105
Type: Invited
Date/Time: Monday, August 4, 2014 : 8:30 AM to 10:20 AM
Sponsor: Social Statistics Section
Abstract #310913
Title: Reducing Sampling Bias in Social Media Data for County Health Inference
Author(s): Aron Culotta*+
Companies: Illinois Institute of Technology
Keywords: social media ; health ; natural language processing ; reweighting
Abstract:

A number of recent studies have demonstrated the utility of social media data for inferring societal attributes such as public opinion and health. A commonly cited limitation of this methodology is selection bias: social media users are a non-representative sample of the population. The bias is exacerbated by filtering steps that further restrict the sample in biased ways. Building on recent work in computational linguistics that infers people's demographic attributes from their communications, we investigate methods to quantify and control for selection bias in social media studies. We present results estimating several county-level health statistics (e.g., obesity, diabetes, access to healthy foods) from Twitter activity in the top 100 counties in the U.S., and we compare strategies for reducing selection bias.
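
As an illustration of the kind of reweighting named in the keywords, the following is a minimal sketch of post-stratification, one common way to correct a non-representative sample. It is not necessarily the method used in the talk. The group labels, the per-user scores, and the function name `poststratified_mean` are hypothetical, and the population shares are assumed to come from an external source such as census data.

```python
from collections import defaultdict


def poststratified_mean(samples, population_shares):
    """Estimate a population mean from a demographically skewed sample.

    samples: list of (group, value) pairs, e.g. inferred demographic
             group of a user and a per-user score (hypothetical inputs).
    population_shares: dict mapping group -> share of the target
             population (assumed known, e.g. from census figures).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for group, value in samples:
        sums[group] += value
        counts[group] += 1

    # Only groups that appear in the sample can be reweighted, so the
    # population shares are renormalized over the observed groups.
    observed = [g for g in population_shares if counts[g] > 0]
    if not observed:
        raise ValueError("no sampled group matches the population groups")
    total_share = sum(population_shares[g] for g in observed)

    # Weight each group's sample mean by its population share.
    return sum(
        population_shares[g] / total_share * (sums[g] / counts[g])
        for g in observed
    )


# Toy data (made up): the sample over-represents the "young" group.
samples = [("young", 1.0)] * 8 + [("old", 3.0)] * 2
shares = {"young": 0.5, "old": 0.5}

naive = sum(value for _, value in samples) / len(samples)
weighted = poststratified_mean(samples, shares)
```

In this toy case the unweighted mean is pulled toward the over-sampled group, while the reweighted estimate treats both groups according to their assumed population shares. In practice the group labels would themselves be inferred from text and therefore noisy, which this sketch ignores.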


Authors who are presenting talks have a * after their name.

Copyright © American Statistical Association.