All Times EDT

Abstract Details

Activity Number: 132 - SLDS CSpeed 1
Type: Contributed
Date/Time: Monday, August 9, 2021 : 1:30 PM to 3:20 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #317931
Title: On Kernel-Target Alignment and Relevant Dimensions in Kernel Feature Spaces Ensuing from the Decision and Regression Tree Ensembles
Author(s): Dai Feng* and Richard Baumgartner
Companies: AbbVie Inc. and Merck Research Laboratories
Keywords: tree ensemble kernel; kernel-target alignment; relevant dimensions; prototype; visual diagnostic/comparison
Abstract:

Decision and regression tree ensembles such as random forests (RF) and gradient boosted trees (GBTs) can be interpreted as implicit kernel generators, with the proximity matrix representing the tree ensemble kernel. We investigate kernel-target alignment and relevant dimensions for RF and GBT kernels. We build on earlier results that proposed a natural characterization of kernel-target alignment by means of scalar products between the target and the singular vectors of the kernel matrix. We examine the kernel-target alignment and relevant dimensions of the RF and GBT kernels in a comprehensive simulation study and on real-life data sets. We also carry out a sensitivity analysis using reference prototype sets of varying size. Moreover, the accompanying plots furnish a useful visual diagnostic and comparison of the kernels. Overall, we demonstrate that kernel-target alignment concentrated in a small number of strongly aligned relevant dimensions is linked to competitive performance of the tree ensemble-based kernels. Our results can be generalized to any ensemble algorithm that produces a partitioning of the feature space that enables kernel construction.
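
The two ingredients named in the abstract, a proximity-based tree ensemble kernel and the scalar products between the target and the singular vectors of the kernel matrix, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `proximity_kernel` and `alignment_coefficients`, the toy leaf-assignment matrix, and the choice to define proximity as the fraction of trees in which two samples share a leaf are all assumptions made for illustration. In practice, a leaf-assignment matrix of this shape could be obtained from a fitted forest, for example via the `apply` method of a scikit-learn random forest.

```python
import numpy as np


def proximity_kernel(leaves):
    """Proximity matrix: K[i, j] is the fraction of trees in which
    samples i and j fall into the same leaf (assumed definition)."""
    leaves = np.asarray(leaves)
    # Compare every pair of samples tree by tree, then average over trees.
    same_leaf = leaves[:, None, :] == leaves[None, :, :]
    return same_leaf.mean(axis=2)


def alignment_coefficients(K, y):
    """Singular values of K (in decreasing order) and the absolute scalar
    products between the target y and the matching left singular vectors."""
    U, s, _ = np.linalg.svd(K)
    return s, np.abs(U.T @ np.asarray(y, dtype=float))


# Hypothetical toy data: rows are 6 samples, columns are 3 trees,
# entries are the index of the leaf each sample lands in.
leaves = np.array([
    [0, 0, 1],
    [0, 0, 1],
    [0, 1, 1],
    [1, 2, 0],
    [1, 2, 0],
    [1, 2, 2],
])
y = np.array([0, 0, 0, 1, 1, 1])  # hypothetical binary target

K = proximity_kernel(leaves)
singular_values, coeffs = alignment_coefficients(K, y)

# Plotting coeffs against the component index gives the kind of visual
# diagnostic the abstract mentions: a few large leading coefficients
# would indicate alignment concentrated in a few relevant dimensions.
for i, (sv, c) in enumerate(zip(singular_values, coeffs)):
    print(f"component {i}: singular value {sv:.3f}, |u_i . y| = {c:.3f}")
```

Because the singular vectors form an orthonormal basis, the squared coefficients sum to the squared norm of the target, so the profile shows how the target is distributed across the kernel's components.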


Authors who are presenting talks have a * after their name.
