We present an approach to analyzing a (possibly multi-dimensional) time series by assigning to each pair of adjacent points in the series a label (symbol) from a finite set. The sequence of labels is constrained by a regular language and, optionally, by additional constraints such as the total cost of transitions. The algorithm alternates between (1) finding the word in the regular language that has optimal cost, measured as the error in explaining the observed data, and (2) optimizing the representation of the observed data associated with each symbol. This per-symbol representation can be a mean value (as in k-means), an offset together with a basis for a subspace, etc. Given appropriate transition constraints, we obtain a segmentation of the time series, thus solving the speaker diarization problem (partitioning an audio signal into segments and assigning each segment to an individual speaker).
While the regular-language constraints are similar to those used in Hidden Markov Models, we do not assume that transition probabilities are known or constant.
We show results on several datasets and discuss extensions.
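The alternation described above can be illustrated with a minimal sketch. It is not the implementation evaluated here, and it makes several simplifying assumptions: labels are attached to individual points rather than to pairs of adjacent points, the regular language is reduced to a boolean matrix `allowed` of permitted symbol-to-symbol transitions, the additional constraint is a fixed `switch_cost` per label change, and each symbol is represented by a mean value. The names `best_word`, `segment`, `allowed`, and `switch_cost` are hypothetical.

```python
import numpy as np

def best_word(X, means, allowed, switch_cost):
    """Step (1): dynamic program for the lowest-cost label sequence."""
    T, K = len(X), len(means)
    # err[t, k]: squared error of explaining point t with symbol k
    err = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    # trans[j, k]: cost of going from symbol j to k (inf if forbidden)
    trans = np.where(allowed, switch_cost * (1 - np.eye(K)), np.inf)
    cost = err[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        total = cost[:, None] + trans      # total[j, k]: arrive in k via j
        back[t] = total.argmin(axis=0)
        cost = total.min(axis=0) + err[t]
    # Trace the optimal path backwards
    labels = np.empty(T, dtype=int)
    labels[-1] = cost.argmin()
    for t in range(T - 1, 0, -1):
        labels[t - 1] = back[t, labels[t]]
    return labels

def segment(X, means, allowed, switch_cost=1.0, n_iter=20):
    """Alternate between step (1) and step (2) until the means settle."""
    means = means.copy()
    for _ in range(n_iter):
        labels = best_word(X, means, allowed, switch_cost)
        # Step (2): refit each symbol's representation (here, its mean)
        new_means = means.copy()
        for k in range(len(means)):
            if np.any(labels == k):
                new_means[k] = X[labels == k].mean(axis=0)
        if np.allclose(new_means, means):
            break
        means = new_means
    return best_word(X, means, allowed, switch_cost), means

# Toy 1-D signal: ten points near 0 (with one outlier), then ten points at 5
X = np.concatenate([np.zeros(10), np.full(10, 5.0)])[:, None]
X[3, 0] = 3.0
# Symbol 0 may be followed by 0 or 1; symbol 1 may only be followed by 1
allowed = np.array([[True, True], [False, True]])
labels, means = segment(X, np.array([[1.0], [4.0]]), allowed, switch_cost=10.0)
```

In this toy run the outlier at index 3 lies closer to the second mean, so a plain nearest-mean assignment would relabel it; the transition cost and the one-way transition constraint are what keep the segmentation contiguous.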