Abstract:
|
Decision and regression tree ensembles such as random forests and gradient boosted trees are considered state-of-the-art algorithms for tabular data. They are also kernel generators: the tree ensemble kernel is obtained from the proximity matrix. Tree-based kernels have been shown not only to provide a theoretical framework for analyzing tree ensembles, but also to perform well in simulations and on real-life data sets. Kernel fusion, in turn, has been shown to be potentially beneficial in kernel learning, as it may lead to improved performance. Since tree-based kernels are Mercer kernels, they can be linearly combined with non-negative weights, and the resulting aggregated Mercer kernel can in turn be used for kernel learning. Our work investigates the fusion of tree-based kernels. We evaluate the performance of tree-based kernel aggregation methods in a systematic simulation study. In addition, we elucidate the kernel-target alignment of the fused tree-based kernels and investigate the utility of tree-based kernel fusion on real-life examples. Finally, we discuss further non-linear kernel fusion via a tomographic kernel fusion process.
|