Abstract Details

Activity Number: 176 - Bayesian Mixture Modeling, Clustering and Unsupervised Learning
Type: Contributed
Date/Time: Monday, July 29, 2019 : 10:30 AM to 12:20 PM
Sponsor: Section on Bayesian Statistical Science
Abstract #305194
Title: A Bayesian Nonparametric Approach to Clustering Data at Multiple Resolutions
Author(s): Cecilia Balocchi* and Shane T. Jensen
Companies: University of Pennsylvania and University of Pennsylvania
Keywords: multi-resolution clustering; tree hierarchies; Dirichlet process; nested process; hierarchical process

We consider the problem of clustering data at multiple resolutions when the units are organized in a hierarchy that can be described by a tree: higher-resolution entities are nested within lower-resolution ones. A motivating example is the modeling of crime in urban environments at different spatial resolutions: US cities are divided into census tracts, which are divided into census block groups, which are further split into census blocks. We want to partition a city into regions with similar crime frequencies at each resolution while sharing information between partitions at different resolutions. The Dirichlet Process allows data to be partitioned when the number of clusters is unknown. If we knew the partition at higher levels of the hierarchy, such as the census tract level, Hierarchical Dirichlet Processes would be an appropriate model. Nested Dirichlet Processes, in contrast, can model partitions at multiple levels but restrict block group clusters to be nested within census tract clusters. In this work we combine Nested and Hierarchical Dirichlet Processes to allow for more flexible partitions that are not subject to this nesting constraint. We apply this method to crime frequencies in Philadelphia.
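
As a rough illustration of how a Dirichlet Process can partition units without fixing the number of clusters in advance, the sketch below draws a random partition from the Chinese restaurant process, a standard representation of the partition a Dirichlet Process induces. It is not the authors' combined nested and hierarchical model. The function name `sample_crp_partition` and the parameters `n_units`, `alpha`, and `seed` are hypothetical names chosen for this example.

```python
import random


def sample_crp_partition(n_units, alpha, seed=None):
    """Draw cluster labels for n_units from a Chinese restaurant process.

    alpha is the concentration parameter (assumed > 0): larger values
    tend to produce more clusters. This is an illustrative sketch only.
    """
    rng = random.Random(seed)
    labels = []          # cluster label assigned to each unit, in order
    cluster_sizes = []   # number of units currently in each cluster

    for _ in range(n_units):
        # A unit joins existing cluster k with weight proportional to its
        # size, or opens a new cluster with weight proportional to alpha.
        weights = cluster_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(cluster_sizes):
            cluster_sizes.append(1)   # new cluster opened
        else:
            cluster_sizes[k] += 1     # existing cluster grows
        labels.append(k)

    return labels
```

Calling `sample_crp_partition(20, alpha=1.0, seed=0)` returns a list of 20 integer labels. The number of distinct labels is random rather than fixed beforehand, which is the property the abstract relies on.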

Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program