Bayesian network (BN) classifiers are powerful tools for modeling complex dependencies among variables for classification purposes. In this work, we empirically evaluate the following five BN classifiers:
1. naïve Bayes; 2. tree-augmented naïve Bayes; 3. BN-augmented naïve Bayes; 4. parent-child BN; 5. Markov blanket BN.
Two approaches are used to learn the network structure: a score-based approach and a constraint-based approach. The score-based approach uses a score function (e.g., the BIC score) to compare candidate network structures. The constraint-based approach uses independence tests to determine which edges are present and how they are directed.
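As an illustration of the score-based idea only, the following minimal Python sketch computes a BIC score for a discrete network and compares two candidate structures on a toy data set. It is not the HPBNET implementation: the function name `bic_score`, the variable names `Y`, `X1`, `X2`, and the toy counts are all hypothetical, and the number of levels of each variable is assumed to equal the number of distinct values observed in the data.

```python
import math
from collections import Counter

def bic_score(data, structure):
    """BIC of a discrete Bayesian network (higher is better).

    data: list of dicts mapping variable name -> observed value.
    structure: dict mapping each variable -> tuple of its parent names.
    Assumption: a variable's levels are the distinct values seen in data.
    """
    n = len(data)
    levels = {v: len({row[v] for row in data}) for v in structure}
    log_lik = 0.0
    n_params = 0
    for var, parents in structure.items():
        # Counts of (parent configuration, child value) and of parent configuration.
        joint = Counter((tuple(row[p] for p in parents), row[var]) for row in data)
        parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
        # Maximized log-likelihood uses the empirical conditional frequencies.
        for (pa, _), count in joint.items():
            log_lik += count * math.log(count / parent_counts[pa])
        # Free parameters: (child levels - 1) per parent configuration.
        n_parent_configs = 1
        for p in parents:
            n_parent_configs *= levels[p]
        n_params += (levels[var] - 1) * n_parent_configs
    return log_lik - 0.5 * n_params * math.log(n)

# Hypothetical toy data: Y is the class, X1 tracks Y with noise, X2 copies X1.
data = (
    [{"Y": 0, "X1": 0, "X2": 0}] * 40
    + [{"Y": 0, "X1": 1, "X2": 1}] * 10
    + [{"Y": 1, "X1": 1, "X2": 1}] * 40
    + [{"Y": 1, "X1": 0, "X2": 0}] * 10
)

# Naive Bayes: the class is the only parent of each feature.
naive = {"Y": (), "X1": ("Y",), "X2": ("Y",)}
# Augmented structure: adds the edge X1 -> X2 between features.
augmented = {"Y": (), "X1": ("Y",), "X2": ("Y", "X1")}

print("naive Bayes BIC:", bic_score(data, naive))
print("augmented BIC:  ", bic_score(data, augmented))
```

On this toy data the augmented structure scores higher, because the extra edge captures the dependence between `X1` and `X2` and the gain in likelihood outweighs the penalty for the additional parameters.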
We also compare the classification accuracy of these BN classifiers with that of other popular classification methods, such as classification trees, neural networks, and logistic regression.
Our evaluation uses SAS software on simulated data sets. The SAS HPBNET procedure is used to implement all five Bayesian network classifiers. The computing power of the HPBNET procedure is also demonstrated on a large data set of 150 million observations stored on a Cloudera Hadoop server.