SAPTrees: Using Conditional Inference Trees to Characterize Heterogeneity in Human Activity Patterns (307914)
Julian Wolfson, University of Minnesota
Keywords: sensor technology, conditional inference trees, multivariate distance matrix regression
Smartphone sensor technology has revolutionized our ability to collect objective data on human activity, allowing researchers to quantify human activity and behavior with unprecedented accuracy and resolution. This paper focuses on the problem of characterizing human activity patterns inferred from smartphone sensor data. Viewing each individual's data as a sequence, and building on techniques for sequence alignment in genomics, we propose the Sequential Activity Pattern Tree (SAPTree) for characterizing population subgroups that are homogeneous with respect to their activity sequences. Our method follows a two-step approach: it first calculates a pairwise sequence distance matrix, then applies a novel implementation of conditional inference trees (CTrees) that uses multivariate distance matrix regression to determine splits. One key benefit of our method is that it controls the Type I error rate, i.e., the probability of detecting heterogeneous subgroups when none exist. We present visualizations that can aid in interpreting output from sequence-based decision trees, and apply our method to human activity data collected via smartphone sensors in a recent Minneapolis-area study.
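The two-step idea described above can be illustrated with a minimal Python sketch. This is not the SAPTree implementation: it assumes plain edit (Levenshtein) distance as a stand-in for whatever alignment-based distance the method actually uses, and it tests a single categorical covariate with a permutation test on a distance-based pseudo-F statistic, which is the one-factor special case of multivariate distance matrix regression. It does not grow a tree or adjust for testing multiple covariates. All function names and the toy activity codes (H, W, S) are hypothetical.

```python
import random

def edit_distance(a, b):
    """Levenshtein distance between two sequences of activity labels."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (x != y)))   # substitution
        prev = curr
    return prev[-1]

def distance_matrix(seqs):
    """Step 1: pairwise distance matrix over all activity sequences."""
    n = len(seqs)
    return [[edit_distance(seqs[i], seqs[j]) for j in range(n)]
            for i in range(n)]

def pseudo_f(dist, groups):
    """Distance-based pseudo-F for one categorical covariate.

    Requires at least two distinct groups and more subjects than groups.
    """
    n = len(groups)
    labels = sorted(set(groups))
    k = len(labels)
    ss_total = sum(dist[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in labels:
        idx = [i for i in range(n) if groups[i] == g]
        ss_within += sum(dist[i][j] ** 2
                         for a, i in enumerate(idx)
                         for j in idx[a + 1:]) / len(idx)
    if ss_within == 0:
        return float("inf")  # groups are perfectly homogeneous
    return ((ss_total - ss_within) / (k - 1)) / (ss_within / (n - k))

def permutation_p_value(dist, groups, n_perm=999, seed=0):
    """Step 2 (simplified): permutation p-value for a candidate split."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy usage: H = home, W = work, S = shopping (hypothetical codes).
seqs = ["HHHWWWWH", "HHHWWWHH", "HHWWWWWH", "HHHWWWWH",
        "HSSSHSSH", "HSSHHSSH", "HSSSHHSH", "HSSSHSSH"]
employed = [1, 1, 1, 1, 0, 0, 0, 0]
p_value = permutation_p_value(distance_matrix(seqs), employed)
```

In a conditional inference tree, a node would be split only when the smallest such p-value, adjusted for the number of candidate covariates, falls below the chosen significance level. That stopping rule is what gives control over the Type I error rate.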