JSM Preliminary Online Program
This is the preliminary program for the 2006 Joint Statistical Meetings in Seattle, Washington.

The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.


Activity Number: 324
Type: Topic Contributed
Date/Time: Tuesday, August 8, 2006 : 2:00 PM to 3:50 PM
Sponsor: Section on Statisticians in Defense and National Security
Abstract - #305889
Title: Graphs for Streaming Text
Author(s): Elizabeth Hohman*+
Companies: Naval Surface Warfare Center
Address: Code B10, Dahlgren, VA 22448
Keywords: streaming text
Abstract:

Analyzing streaming text data raises several obstacles that do not arise when the corpus of documents is static. A vector space model for text processing usually weights each word by its frequency in the document and its frequency in the corpus. Because corpus frequencies change as new documents arrive, such word weighting must be revised to process a streaming collection of documents. This presentation discusses methods for weighting words in streaming text and for representing a changing corpus with a dynamic graph. The example corpus contains daily news articles from five categories. Graphs represent the streaming corpus, and statistics calculated on the graphs are used to detect changes in the corpus.
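
The abstract does not specify the weighting scheme, the graph construction, or the graph statistics used in the talk. As a rough illustration of the general idea only, the following Python sketch assumes a smoothed TF-IDF weight whose document frequencies are updated as each document arrives, a cosine-similarity graph over a sliding window of recent documents, and edge density as the tracked statistic. All names (StreamingTfidf, similarity_edges, track_stream) and the window and threshold values are hypothetical choices, not the author's method.

```python
import math
from collections import Counter, deque
from itertools import combinations


class StreamingTfidf:
    """Running document-frequency counts; IDF uses only documents seen so far."""

    def __init__(self):
        self.n_docs = 0
        self.doc_freq = Counter()

    def add(self, tokens):
        """Update corpus counts with one document and return its weight vector."""
        counts = Counter(tokens)
        self.n_docs += 1
        self.doc_freq.update(counts.keys())  # each word counted once per document
        # Assumed weighting: term frequency times a smoothed inverse document frequency.
        return {
            w: tf * (math.log((1 + self.n_docs) / (1 + self.doc_freq[w])) + 1.0)
            for w, tf in counts.items()
        }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v[w] for w, x in u.items() if w in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_edges(vectors, threshold=0.3):
    """Edges (i, j) between documents whose cosine similarity meets the threshold."""
    return {
        (i, j)
        for (i, u), (j, v) in combinations(enumerate(vectors), 2)
        if cosine(u, v) >= threshold
    }


def edge_density(n_nodes, edges):
    """Fraction of possible document pairs that are connected."""
    pairs = n_nodes * (n_nodes - 1) // 2
    return len(edges) / pairs if pairs else 0.0


def track_stream(docs, window=3, threshold=0.3):
    """Return the edge density of the window graph after each arriving document."""
    model = StreamingTfidf()
    recent = deque(maxlen=window)  # oldest document drops out automatically
    densities = []
    for tokens in docs:
        recent.append(model.add(tokens))
        edges = similarity_edges(list(recent), threshold)
        densities.append(edge_density(len(recent), edges))
    return densities
```

In this sketch a document's weights are fixed at arrival time and are not recomputed when later documents change the corpus frequencies; that is a simplification, and a drop in edge density is only one of many possible signals that the topic mix in the stream has shifted.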


  • The address information is for the authors that have a + after their name.
  • Authors who are presenting talks have a * after their name.

JSM 2006: For information, contact jsm@amstat.org or phone (888) 231-3473. If you have questions about the Continuing Education program, please contact the Education Department.
Revised April, 2006