Online Program

Return to main conference page
Saturday, May 19
Machine Learning
Machine Learning for Complex Data
Sat, May 19, 10:30 AM - 12:00 PM
Grand Ballroom D
 

Regression Trees and Ensemble Methods for Multivariate Outcomes (304564)

Mousumi Banerjee, University of Michigan 
*Evan Lee Reynolds, University of Michigan 

Keywords: Statistical Learning, Classification and Regression Trees, Multivariate Outcomes, Ensemble Classifiers

Tree-based methods have become one of the most flexible, intuitive, and powerful data analytic tools for exploring complex data structures. The best documented, and arguably most popular uses of tree-based methods are in biomedical research, where multivariate outcomes occur commonly (e.g. diastolic and systolic blood pressure, periodontal measures in dental health studies, and nerve density measures in studies of neuropathy). Existing tree-based methods for multivariate outcomes do not appropriately take into account the correlation that exists in the outcomes. In this paper, we develop three goodness of split measures for multivariate tree building for continuous outcomes. The proposed measures utilize specific aspects of the variance-covariance matrix at each potential split. Additionally, to enhance prediction accuracy we extend the single multivariate regression tree to an ensemble of trees. Extensive simulations are presented to examine the properties of our goodness of fit measures. Finally, the proposed methods are illustrated using two clinical datasets of neuropathy and pediatric cardiac surgery.