
All Times EDT

Abstract Details

Activity Number: 498 - Modern Machine Learning
Type: Contributed
Date/Time: Thursday, August 6, 2020, 10:00 AM to 2:00 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #309672
Title: Random Forest Kernels: Utility and Insights for Interpretable Statistical Learning
Author(s): Dai Feng* and Richard Baumgartner
Companies: AbbVie and Merck
Keywords: random forests; kernel methods; Gaussian process; prototype

Breiman’s random forest can be interpreted as an implicit kernel generator, where the resulting proximity matrix represents a data-driven kernel. Under mild assumptions, it can be shown that this kernel asymptotically approaches a Laplacian kernel. Furthermore, it has recently been shown that the Laplacian kernel also underlies other tree-based ensembles such as the Mondrian forest and BART. The kernel perspective on random forests has been used to develop a principled framework for the theoretical investigation of their statistical properties. However, the practical utility of the links between kernels and random forests has not been widely explored.
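
The proximity kernel described above can be sketched in a few lines: the proximity of two observations is the fraction of trees in which they fall into the same terminal node. This is a minimal illustration (not the authors' code), using scikit-learn and a synthetic dataset; all variable names and settings are assumptions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic continuous-outcome data (illustrative only)
X, y = make_regression(n_samples=100, n_features=5, random_state=0)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# apply() returns, for each sample, the terminal-node index in every tree:
# shape (n_samples, n_trees)
leaves = rf.apply(X)

# Proximity = fraction of trees in which two samples share a terminal node
proximity = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)
```

The resulting `proximity` matrix is symmetric with unit diagonal and entries in [0, 1], i.e. a data-driven similarity that can be used wherever a kernel matrix is expected.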

The focus of our work is the investigation of the interplay between kernel methods and random forests. We elucidate the properties of data-driven random forest kernels in a simulation study of continuous, binary, and survival outcomes. We also give a real-life example showing how these insights may be leveraged in practice. Finally, we discuss further extensions of random forest kernels in the context of Gaussian process prediction and interpretable prototypical regression.
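
One way the Gaussian process connection can be exercised is to plug the forest's proximity matrix into the standard GP posterior mean, m = K_*(K + σ²I)⁻¹y. The sketch below is a hypothetical illustration under assumed settings (noise variance, sample split), not the authors' implementation.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic data with a train/test split (illustrative only)
X, y = make_regression(n_samples=120, n_features=5, noise=5.0, random_state=1)
X_train, X_test, y_train = X[:100], X[100:], y[:100]

rf = RandomForestRegressor(n_estimators=300, random_state=1).fit(X_train, y_train)

def proximity(A, B, forest):
    """Fraction of trees in which a row of A and a row of B share a leaf."""
    la, lb = forest.apply(A), forest.apply(B)
    return (la[:, None, :] == lb[None, :, :]).mean(axis=2)

K = proximity(X_train, X_train, rf)        # train-train kernel matrix
K_star = proximity(X_test, X_train, rf)    # test-train cross-kernel
sigma2 = 1.0                               # assumed observation-noise variance

# GP posterior mean with the proximity matrix as the covariance
alpha = np.linalg.solve(K + sigma2 * np.eye(len(K)), y_train)
gp_mean = K_star @ alpha
```

The noise term σ²I also regularizes the solve, since the raw proximity matrix can be ill-conditioned in small samples.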

Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program