Abstract:
|
Least squares regression trees, though rife with ad hoc approaches and heuristic justifications, have been widely studied and implemented in many automated statistical methods. This paper proposes a method for constructing optimally sized trees by maximum likelihood. We adopt the CART (Breiman et al., 1984) strategy of first growing a large tree, pruning it back to a sequence of subtrees, and then selecting the best-sized tree. However, we develop different methods for splitting, pruning, and size selection. The whole procedure can be rigorously and straightforwardly justified with standard likelihood-type arguments. Moreover, the proposed method extends easily to binary/categorical responses, count data, and even censored survival times. Simulation experiments show that ML regression trees dramatically improve the accuracy of structure detection compared with CART. Practical examples are also given for illustration.
|