Abstract:
|
Pattern discovery is widely used to analyze market data. To date, the focus has been on frequent patterns. However, in applications such as detecting anomalies in computer networks and identifying security intrusions, frequent patterns characterize normal behavior, which is not of interest in these domains. Rather, the interest is in patterns that precede malfunctions or other undesirable situations. Such patterns are characterized by items that co-occur with high probability; long, infrequent patterns are of particular interest because they provide better predictive capability. Unfortunately, defining infrequent patterns in terms of the probability of item co-occurrence yields neither upward nor downward closure, and hence efficient algorithms cannot be constructed. Herein, we circumvent this problem by proposing fully dependent patterns (d-patterns), defined so that all subsets of a d-pattern are also d-patterns, a condition that ensures downward closure. We develop a statistical test to qualify d-patterns and construct an efficient algorithm for their discovery. We apply our algorithm to data from a real network and show that several patterns of interest are discovered.
|