This is the program for the 2010 Joint Statistical Meetings in Vancouver, British Columbia.

Abstract Details

Activity Number: 681
Type: Contributed
Date/Time: Thursday, August 5, 2010, 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Computing
Abstract - #308071
Title: Distribution of Statistics of h-GAP Clusters for a Collection of Words
Author(s): Deidra Andrea Coleman*+ and Donald E.K. Martin
Companies: North Carolina State University and North Carolina State University
Address: Campus Box 8203, Raleigh, NC, 27695
Keywords: auxiliary Markov chain; clump counting; h-gap clusters; coarsest partition
Abstract:

We give a recursive method for computing the exact distribution of the number of h-gap clusters of a collection of words and of their coverage. An h-gap cluster begins with a word of the collection and continues while there is no gap between word occurrences that is greater than h. When h = -1, the words of a cluster must overlap, and the cluster is called a clump. The methods facilitate the computation of p-values for testing procedures. A word may contain other words of the collection, so the computation handles this more general case. The underlying sequence is assumed to be Markovian of an arbitrary order. Probabilities are updated recursively using an auxiliary Markov chain, making the computations easy to implement. Automata theory is used to help limit the number of states of the auxiliary chain. The methodology is applied to pattern discovery in DNA sequences.
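
The definition of an h-gap cluster can be made concrete by counting clusters in one observed sequence. The following is a minimal sketch, not the authors' recursive method: it only identifies clusters in a given string and does not compute any distribution. The function names are hypothetical. Two conventions are assumptions not stated in the abstract: the gap is taken to be the number of letters strictly between the end of the current cluster and the start of the next word occurrence, and coverage is taken to be the total length of the cluster spans, including any gap letters inside a cluster.

```python
def find_occurrences(seq, words):
    """Return sorted (start, end) index pairs (inclusive) of every word occurrence.

    Words are assumed to be non-empty strings.
    """
    occurrences = []
    for word in words:
        start = seq.find(word)
        while start != -1:
            occurrences.append((start, start + len(word) - 1))
            # Advance by one position so overlapping occurrences are kept.
            start = seq.find(word, start + 1)
    return sorted(occurrences)


def h_gap_clusters(seq, words, h):
    """Group word occurrences into h-gap clusters, returned as (start, end) spans."""
    clusters = []
    for start, end in find_occurrences(seq, words):
        # Assumed gap convention: letters strictly between the current
        # cluster's last position and the start of this occurrence.
        # A negative gap means the occurrence overlaps the cluster.
        if clusters and start - clusters[-1][1] - 1 <= h:
            prev_start, prev_end = clusters[-1]
            # A word nested inside a longer word must not shrink the cluster.
            clusters[-1] = (prev_start, max(prev_end, end))
        else:
            clusters.append((start, end))
    return clusters


def count_and_coverage(seq, words, h):
    """Return (number of clusters, coverage) under the assumed conventions."""
    clusters = h_gap_clusters(seq, words, h)
    coverage = sum(end - start + 1 for start, end in clusters)
    return len(clusters), coverage


print(count_and_coverage("ACGACGTTTACG", ["ACG"], -1))  # (3, 9)
print(count_and_coverage("ACGACGTTTACG", ["ACG"], 0))   # (2, 9)
print(count_and_coverage("AAAA", ["AA"], -1))           # (1, 4)
```

With h = -1 the adjacent occurrences of ACG do not overlap and so form separate clumps, while with h = 0 they merge into one cluster. In AAAA the three overlapping occurrences of AA form a single clump.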


The address information is for the authors that have a + after their name.
Authors who are presenting talks have a * after their name.
