All Times EDT

Abstract Details

Activity Number: 440 - SLDS CSpeed 8
Type: Contributed
Date/Time: Thursday, August 12, 2021 : 4:00 PM to 5:50 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #318067
Title: A Modified Bayesian Information Criterion for Improving the Performance of Tree-Based Learning Algorithms Without the Use of Cross-Validation
Author(s): Nikola Surjanovic* and Andrew Henrey and Thomas Loughin
Companies: Simon Fraser University and Finning and Simon Fraser University
Keywords: regression trees; pruning; information criteria; cross-validation; machine learning; random forest
Abstract:

Casting tree building as a change-point detection problem, we show that it is possible to prune a regression tree efficiently using properly modified information criteria, and we discuss some applications to tree-based ensemble learning methods. We prove that one of the proposed pruning approaches using a modified Bayesian information criterion is consistent for identifying the correct tree model when it exists as a subtree within a larger tree. In practice, we obtain simplified trees that can have prediction accuracy comparable to trees obtained using standard cost-complexity pruning. We briefly discuss an extension to random forests that adaptively prunes trees to prevent excessive variance. The extension includes regular random forests as a special case, and is therefore expected to perform at least as well, with a negligible additional computational cost.
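
To illustrate the general idea of pruning with an information criterion instead of cross-validation, here is a minimal sketch. It is not the authors' method: the abstract does not give the form of the modified criterion, so the score below is a generic BIC-style stand-in. Everything in it is an assumption made for illustration: the names Node, grow, prune, criterion, and make_demo_data, the hypothetical split_cost parameter (an extra charge per split on top of one parameter per leaf mean), and the choice to score each subtree locally with the sample size at that node. A single ordered predictor is assumed, so each split is a change point in the response sequence.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    ys: List[float]                  # responses reaching this node, ordered by the predictor
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None


def rss(ys: List[float]) -> float:
    """Residual sum of squares around the mean of ys."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def leaves(node: Node) -> List[Node]:
    """Leaves of the subtree rooted at node, left to right."""
    if node.is_leaf():
        return [node]
    return leaves(node.left) + leaves(node.right)


def grow(ys: List[float], depth: int, min_leaf: int = 5) -> Node:
    """Greedily grow a tree by choosing the RSS-minimizing change point."""
    node = Node(ys)
    if depth == 0 or len(ys) < 2 * min_leaf:
        return node
    best = min(range(min_leaf, len(ys) - min_leaf + 1),
               key=lambda i: rss(ys[:i]) + rss(ys[i:]))
    node.left = grow(ys[:best], depth - 1, min_leaf)
    node.right = grow(ys[best:], depth - 1, min_leaf)
    return node


def criterion(rss_value: float, n: int, n_leaves: int, split_cost: float) -> float:
    """Generic BIC-style score (assumed form, not the paper's criterion).

    One parameter per leaf mean, plus a hypothetical extra charge of
    split_cost per split to account for searching over split locations.
    """
    n_params = n_leaves + split_cost * (n_leaves - 1)
    # The small constant guards against log(0) for a perfect fit.
    return n * math.log(rss_value / n + 1e-12) + n_params * math.log(n)


def prune(node: Node, split_cost: float = 2.0) -> Node:
    """Bottom-up pruning: collapse a subtree if that does not worsen the score."""
    if node.is_leaf():
        return node
    prune(node.left, split_cost)
    prune(node.right, split_cost)
    n = len(node.ys)
    sub = leaves(node)
    keep = criterion(sum(rss(leaf.ys) for leaf in sub), n, len(sub), split_cost)
    collapse = criterion(rss(node.ys), n, 1, split_cost)
    if collapse <= keep:
        node.left = node.right = None
    return node


def make_demo_data() -> List[float]:
    """Deterministic toy data: small periodic noise with one mean shift at index 40."""
    noise = [0.1 * ((i * 7) % 5 - 2) for i in range(80)]
    return [e + (0.0 if i < 40 else 5.0) for i, e in enumerate(noise)]
```

With this toy data, grow(make_demo_data(), depth=3) produces more than two leaves, and prune then collapses the tree back to the two leaves separated by the single true change point. Setting split_cost to 0 recovers a plain parameter count, which penalizes splits less and so tends to retain more of them.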


Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program