567 – Computing with Time Series Data
Scan Statistic Distribution Through Slack Equations
Donald Martin
North Carolina State University
Scan statistics are used in various areas of applied probability and statistics to study local clumping of patterns. Testing based on a scan statistic requires tail probabilities of its distribution. Although the distributions of various scan statistics have been studied extensively, most of the results are approximations because exact computation is difficult. Existing results give exact p-values for the statistic over a binary sequence that is independent or first-order Markovian. However, in many practical applications the variables under study take on multiple values, or a model with higher-order dependence provides a better fit, or both. The present paper fills this gap by obtaining the distribution of the univariate scan statistic for multi-state trials that are Markovian of a general order of dependence. A deterministic finite automaton is developed to index the computation, and a matrix corresponding to the automaton's transitions is used to update probabilities. Examples are given to illustrate the algorithm.
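To make the idea of updating probabilities over automaton states concrete, the following is a minimal sketch and not the paper's construction. It assumes the scan statistic is the maximum, over all full windows of length w, of the sum of per-symbol scores, and it uses the last w - 1 symbols as the state, with a dictionary update standing in for multiplication by a transition matrix. The function name `scan_tail_probability` and all of its parameters are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def scan_tail_probability(n, w, k, scores, init, trans, order):
    """Exact P(max window sum >= k) for a multi-state Markov sequence.

    Assumptions of this sketch (not taken from the paper):
      - symbols are 0..len(scores)-1 and scores[s] is the score of symbol s
      - init maps each tuple of the first `order` symbols to its probability
      - trans maps each length-`order` context tuple to a list of
        next-symbol probabilities
      - 1 <= order <= w - 1 and n >= w
    """
    if not (1 <= order <= w - 1 and n >= w):
        raise ValueError("this sketch assumes 1 <= order <= w - 1 and n >= w")

    # dist holds the probability of each non-absorbed state (recent symbols).
    dist = dict(init)
    hit = 0.0  # probability mass already absorbed (some window reached k)

    for _ in range(n - order):
        nxt = defaultdict(float)
        for hist, p in dist.items():
            for sym, q in enumerate(trans[hist[-order:]]):
                if q == 0.0:
                    continue
                new = hist + (sym,)
                if len(new) == w:
                    # A full window is available: check it, then drop
                    # the oldest symbol so the state keeps w - 1 symbols.
                    if sum(scores[x] for x in new) >= k:
                        hit += p * q
                        continue
                    new = new[1:]
                nxt[new] += p * q
        dist = nxt
    return hit


# Usage: fair binary trials written as an order-1 chain with identical rows.
# P(at least two consecutive ones in three trials) = 3/8.
binary_trans = {(0,): [0.5, 0.5], (1,): [0.5, 0.5]}
binary_init = {(0,): 0.5, (1,): 0.5}
print(scan_tail_probability(3, 2, 2, [0, 1], binary_init, binary_trans, 1))
# prints 0.375
```

The state space here grows as the number of symbols raised to the power w - 1, so this sketch is practical only for short windows; a purpose-built automaton would typically need far fewer states.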