JSM Preliminary Online Program
This is the preliminary program for the 2009 Joint Statistical Meetings in Washington, DC.

The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.






Activity Number: 72
Type: Contributed
Date/Time: Sunday, August 2, 2009 : 4:00 PM to 5:50 PM
Sponsor: Section on Statistical Computing
Abstract - #304075
Title: Distribution of Clump Statistics for a Collection of Words
Author(s): Deidra A. Coleman*+ and Donald E.K. Martin
Companies: North Carolina State University and North Carolina State University
Address: Campus Box 8203, Raleigh, NC, 27695-8203
Keywords: clump counting ; coarsest partition ; clusters
Abstract:

We give a recursive method for computing the exact distributions of the number of occurrences and the coverage of clumps (maximal sets of overlapping words) and h-gap clusters (word clusters in which gaps of length at most h are allowed) for a collection of words. The underlying sequence is assumed to be Markovian of arbitrary order. An auxiliary Markov chain is formed to simplify the computations; a "coarsest partition" of the state components that indicate progress into clumps helps limit the number of states of that chain. The methods are applied to pedagogic examples and to pattern discovery in DNA sequences.
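
The idea of counting clumps with an auxiliary chain can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes a first-order Markov sequence, assumes that no word in the collection is a substring of another, counts clumps only (no coverage, no h-gap clusters), and makes no attempt at a "coarsest partition" of states. All names (`count_clumps`, `clump_dist_brute`, `clump_dist_dp`, `init`, `trans`) are hypothetical. A brute-force enumeration is included so the forward recursion can be checked against it on short sequences.

```python
from collections import defaultdict
from itertools import product


def count_clumps(seq, words):
    """Count clumps (maximal sets of overlapping word occurrences) in seq."""
    # Intervals (start, end) of every occurrence, sorted by start position.
    spans = sorted(
        (i, i + len(w) - 1)
        for w in words
        for i in range(len(seq) - len(w) + 1)
        if seq.startswith(w, i)
    )
    clumps, reach = 0, -1  # reach: last position covered by the current clump
    for start, end in spans:
        if start > reach:  # shares no position with the current clump
            clumps += 1
        reach = max(reach, end)
    return clumps


def clump_dist_brute(n, words, init, trans):
    """Exact distribution by enumerating all sequences (exponential in n)."""
    dist = defaultdict(float)
    for seq in product(init, repeat=n):
        p = init[seq[0]]
        for a, b in zip(seq, seq[1:]):
            p *= trans[a][b]
        dist[count_clumps("".join(seq), words)] += p
    return dict(dist)


def clump_dist_dp(n, words, init, trans):
    """Same distribution by forward recursion over an auxiliary chain.

    Assumption: no word is a substring of another, so at most one word ends
    at each position and occurrences are met in order of their start.
    """
    longest = max(len(w) for w in words)
    keep = max(longest - 1, 1)  # symbols of history carried in the state
    # State: (recent symbols, distance since last occurrence end, clump count)
    states = {("", None, 0): 1.0}
    for _ in range(n):
        nxt = defaultdict(float)
        for (recent, gap, k), p in states.items():
            for x in init:
                q = p * (trans[recent[-1]][x] if recent else init[x])
                text = recent + x
                since = None if gap is None else gap + 1
                hit = next((w for w in words if text.endswith(w)), None)
                if hit is None:
                    gap2 = None if since is None else min(since, longest)
                    nxt[(text[-keep:], gap2, k)] += q
                else:
                    # The new occurrence overlaps the current clump only if
                    # it starts at or before the previous occurrence's end.
                    overlaps = since is not None and since < len(hit)
                    nxt[(text[-keep:], 0, k if overlaps else k + 1)] += q
        states = nxt
    dist = defaultdict(float)
    for (_, _, k), p in states.items():
        dist[k] += p
    return dict(dist)


# Illustrative two-letter alphabet with made-up probabilities.
init = {"A": 0.5, "B": 0.5}
trans = {"A": {"A": 0.3, "B": 0.7}, "B": {"A": 0.6, "B": 0.4}}
dist = clump_dist_dp(8, ["AAB", "BA"], init, trans)
```

Here `dist` maps each possible clump count to its probability. The state of the recursion grows with the longest word rather than with the sequence length, which is the reason for introducing an auxiliary chain in the first place.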


  • The address information is for the authors that have a + after their name.
  • Authors who are presenting talks have a * after their name.



JSM 2009 For information, contact jsm@amstat.org or phone (888) 231-3473. If you have questions about the Continuing Education program, please contact the Education Department.
Revised September, 2008