Abstract:
|
To cope with the enormous number of variables (on the order of thousands) that can affect system performance, we introduce a three-phase pattern detection technique. In Phase 1, variable clustering groups correlated variables together, and only one variable is chosen from each cluster. In Phase 2, regression trees are generated for each parameter, and a predictability graph is formed from the regression trees. In Phase 3, a directed graph is generated from all variables selected in Phase 2, and a transitive-closure-based algorithm selects parameters that "cover" all the other parameters. The result is a set of variables that, according to the variable importance indicator, are most sensitive to changes in system resources. The directed graph can then be used to study patterns related to system performance. To assess the effectiveness of this approach, we implement the technique and use it to analyze different configurations of a TPC-C (Transaction Processing Performance Council benchmark) database system. The experimental results indicate the effectiveness of the proposed technique.
|
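The Phase 3 selection step can be illustrated with a minimal sketch. The abstract does not spell out the algorithm, so everything below is an assumption: an edge `u -> v` is taken to mean "u helps predict v", a parameter "covers" every parameter reachable from it in the transitive closure, and a greedy heuristic (not necessarily the authors' method) picks the covering set. The function names and the example parameter names are hypothetical.

```python
def transitive_closure(nodes, edges):
    """Return reach[u]: the set of nodes reachable from u (including u)."""
    reach = {u: {u} for u in nodes}
    for u, v in edges:
        reach[u].add(v)
    # Warshall-style pass: if u reaches k, u also reaches everything k reaches.
    for k in nodes:
        for u in nodes:
            if k in reach[u]:
                reach[u] |= reach[k]
    return reach


def select_cover(nodes, edges):
    """Greedily pick parameters whose reachable sets cover all parameters.

    Assumption: greedy set cover stands in for the paper's selection rule.
    Ties are broken by the order of `nodes`.
    """
    reach = transitive_closure(nodes, edges)
    uncovered = set(nodes)
    chosen = []
    while uncovered:
        # Every node reaches itself, so each round covers at least one node.
        best = max(nodes, key=lambda u: len(reach[u] & uncovered))
        chosen.append(best)
        uncovered -= reach[best]
    return chosen


if __name__ == "__main__":
    # Hypothetical parameters and predictability edges, for illustration only.
    params = ["cpu", "io_wait", "buffer_hits", "lock_waits", "tps"]
    predicts = [("cpu", "tps"), ("io_wait", "buffer_hits"), ("buffer_hits", "tps")]
    print(select_cover(params, predicts))
```

In this toy graph, `io_wait` is picked first because it reaches `buffer_hits` and `tps`; `cpu` and `lock_waits` follow because nothing else reaches them. Greedy selection does not guarantee a minimum-size cover, which is one reason to treat this only as a sketch.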