JSM 2015


David Marchette

Naval Surface Warfare Center Dahlgren Division


A Statistical Analysis of a Time Series of Twitter Graphs

Sponsor: Statistics in Defense and National Security Section

Keywords: social media, twitter, geographic inference, analysis of large graphs


In this paper I describe a set of Twitter data that we have been collecting for nearly two years. Using the Twitter streaming API, we collect all tweets geo-located within a set of rectangles covering the main landmasses of the world, as well as tweets containing certain key phrases. We collect "all" geo-located tweets in the sense that Twitter provides every tweet geo-located within a rectangle, provided the volume does not exceed a fixed limit. These tweets define a "mentions" digraph: each user id is a vertex, and there is an edge from s to t if a tweet from s mentions t, as in @s: "@t u wanna go to lunch?". These mentions digraphs can be computed on time intervals to produce a time series of graphs. These graphs tend to have power-law degree distributions; I will describe them and discuss some thoughts on how one might model them. Using the graphs, I will also discuss methods for inferring node attributes, such as the geo-position of a user whose tweets are not geo-located, and for detecting spoofed geo-locations.
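As an illustration only, the mentions digraph and its time series can be sketched in a few lines of Python. This is not the collection or analysis code behind the paper. The tweet layout (timestamp in seconds, sender, text), the function names, the one-hour window, and the use of a regular expression on screen names in place of user ids are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumption: mentions are recovered from the tweet text as "@name" tokens.
MENTION = re.compile(r"@(\w+)")


def mentions_digraphs(tweets, window_seconds):
    """Build one directed edge set per fixed-width time window.

    tweets: iterable of (timestamp_seconds, sender, text) tuples
            (a hypothetical layout chosen for this sketch).
    Returns a dict mapping window index -> set of (sender, mentioned) edges.
    """
    graphs = defaultdict(set)
    for timestamp, sender, text in tweets:
        window = int(timestamp // window_seconds)
        for target in MENTION.findall(text):
            # Edge from s to t whenever a tweet from s mentions t.
            graphs[window].add((sender, target))
    return dict(graphs)


def out_degrees(edges):
    """Count outgoing edges per vertex for one edge set."""
    degrees = defaultdict(int)
    for source, _ in edges:
        degrees[source] += 1
    return dict(degrees)


# Toy data, invented for the example.
tweets = [
    (10, "s", "@t u wanna go to lunch?"),
    (20, "t", "@s sure, bring @u"),
    (3700, "u", "@s thanks for lunch"),
]
graphs = mentions_digraphs(tweets, window_seconds=3600)
```

Here `graphs[0]` holds the edges observed in the first hour and `graphs[1]` those in the second, so the dictionary plays the role of a small time series of digraphs. Degree counts such as `out_degrees(graphs[0])` are the kind of per-graph summary one would examine when checking for a power-law degree distribution.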
