Activity Number:
|
132 - SLDS CSpeed 1
|
Type:
|
Contributed
|
Date/Time:
|
Monday, August 9, 2021 : 1:30 PM to 3:20 PM
|
Sponsor:
|
Section on Statistical Learning and Data Science
|
Abstract #317931
|
Title:
|
On Kernel-Target Alignment and Relevant Dimensions in Kernel Feature Spaces Ensuing from the Decision and Regression Tree Ensembles
|
Author(s):
|
Dai Feng* and Richard Baumgartner
|
Companies:
|
AbbVie Inc. and Merck Research Laboratories
|
Keywords:
|
tree ensemble kernel;
kernel-target alignment;
relevant dimensions;
prototype;
visual diagnostic/comparison
|
Abstract:
|
Decision and regression tree ensembles such as random forest (RF) and gradient boosted trees (GBTs) can be interpreted as implicit kernel generators, with the proximity matrix representing the tree ensemble kernel. We investigate kernel-target alignment and relevant dimensions for the RF and GBT kernels. We build on earlier results that proposed a natural characterization of kernel-target alignment by means of scalar products between the target and the singular vectors of the kernel matrix. We elucidate the kernel-target alignment and relevant dimensions ensuing from the RF and GBT kernels in a comprehensive simulation study and on real-life data sets. We also carry out a sensitivity analysis using reference prototype sets of varying size. Moreover, the accompanying plots furnish a useful visual diagnostic and comparison of the kernels. Overall, we demonstrate that kernel-target alignment concentrated in a small number of strongly aligned relevant dimensions is linked to competitive performance of the tree ensemble-based kernels. Our results can be generalized to any ensemble algorithm that produces a partitioning of the feature space that enables kernel construction.
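
The two ingredients named above, a proximity kernel built from partitions and the scalar products between the target and the kernel's singular vectors, can be illustrated with a minimal sketch. It is not the authors' implementation. Random one-dimensional partitions stand in for fitted RF or GBT trees, the function names `proximity_kernel` and `alignment_spectrum` are hypothetical, and leaving the target uncentered and unnormalized is an assumption. Because the proximity kernel is symmetric and positive semidefinite, its singular vectors coincide with its eigenvectors, so the sketch uses a symmetric eigendecomposition.

```python
import numpy as np


def proximity_kernel(leaf_ids):
    """Proximity kernel from leaf assignments.

    leaf_ids has shape (n_samples, n_trees); entry (i, t) is the leaf
    (partition cell) that sample i falls into in tree t.
    K[i, j] is the fraction of trees in which samples i and j share a leaf.
    """
    leaf_ids = np.asarray(leaf_ids)
    same_leaf = leaf_ids[:, None, :] == leaf_ids[None, :, :]
    return same_leaf.mean(axis=2)


def alignment_spectrum(K, y):
    """Scalar products between the target and the eigenvectors of K.

    Returns the eigenvalues in descending order and the coefficients
    u_i^T y in the same order. Assumption: y is used as given, without
    centering or scaling.
    """
    eigvals, eigvecs = np.linalg.eigh(K)  # ascending order for symmetric K
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    coeffs = eigvecs.T @ np.asarray(y, dtype=float)
    return eigvals, coeffs


# Toy data (hypothetical): a smooth one-dimensional signal plus noise.
rng = np.random.default_rng(0)
n_samples, n_trees, n_cuts = 200, 100, 8
x = rng.uniform(0.0, 1.0, size=n_samples)
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=n_samples)

# Stand-in for a tree ensemble: each "tree" is a random partition of [0, 1]
# defined by sorted random cut points; the leaf id is the interval index.
leaf_ids = np.column_stack([
    np.searchsorted(np.sort(rng.uniform(0.0, 1.0, size=n_cuts)), x)
    for _ in range(n_trees)
])

K = proximity_kernel(leaf_ids)
eigvals, coeffs = alignment_spectrum(K, y)

# Cumulative share of the squared target norm captured by the leading
# kernel dimensions; a steep rise suggests few relevant dimensions.
energy_share = np.cumsum(coeffs**2) / np.sum(coeffs**2)
print("share captured by the 10 leading dimensions:", energy_share[9])
```

With a fitted ensemble, the leaf assignments would come from the model instead of random cuts, and plotting the absolute coefficients against the dimension index gives one possible version of the visual diagnostic the abstract mentions.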
|
Authors who are presenting talks have a * after their name.