Abstract Details

Activity Number: 686
Type: Topic Contributed
Date/Time: Thursday, August 4, 2016 : 10:30 AM to 12:20 PM
Sponsor: Social Statistics Section
Abstract #320018
Title: Cascaded High-Dimensional Histograms and an Application to Criminology
Author(s): Siong Thye Goh* and Cynthia Rudin
Companies: MIT and Duke University
Keywords: Criminology ; Cascaded Histogram ; Nonparametric Density Estimation ; Interpretable Models ; Density List ; Generative Model
Abstract:

Understanding the distribution of the types of crime in a region enables us to analyze crime effectively. To do this, we propose a new nonparametric density estimation method that yields models that are interpretable to human experts. We present tree- and list-structured density estimation methods for high-dimensional categorical data. The density is constant in each leaf, similar to the flat density within a histogram bin. Histograms, however, cannot easily be visualized in high dimensions, whereas our models can. The accuracy of histograms degrades as the number of dimensions increases, whereas our models have priors that help with generalization. Our models are also sparse, unlike high-dimensional histograms. We present three generative models. The first allows the user to specify the desired number of leaves in the tree within a Bayesian prior. The second allows the user to specify the desired number of branches within the prior. The third returns lists and allows the user to specify the desired number of rules and the length of the rules within the prior. Our results show that the new methods yield a better balance between sparsity and accuracy.
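
To make the "constant density in each leaf" idea concrete, here is a minimal Python sketch of a list-structured density over categorical data. It is an illustration under stated assumptions, not the authors' method: the toy domain, the rule list, and the function names are all hypothetical, and the leaf densities are plain relative frequencies with none of the Bayesian priors described above. Each leaf's density is its share of the data divided by the number of category combinations the leaf covers, which is the categorical analogue of a flat histogram bin.

```python
from itertools import product

# Hypothetical toy domain: two categorical features (not from the abstract).
DOMAIN = {
    "crime_type": ["burglary", "theft", "assault"],
    "time_of_day": ["day", "night"],
}

# Hypothetical density list: ordered rules, where the first matching rule
# defines the leaf. A rule maps a feature to the set of values it accepts;
# the empty rule matches everything and acts as the default leaf.
RULES = [
    {"crime_type": {"burglary"}, "time_of_day": {"night"}},
    {"crime_type": {"theft"}},
    {},
]


def leaf_index(point, rules):
    """Return the index of the first rule that the point satisfies."""
    for i, rule in enumerate(rules):
        if all(point[feature] in allowed for feature, allowed in rule.items()):
            return i
    raise ValueError("no rule matched; the last rule should be a default")


def fit_density_list(data, rules, domain):
    """Estimate one constant density per leaf from relative frequencies."""
    features = list(domain)
    # Enumerate every cell of the categorical domain to measure leaf volumes.
    cells = [dict(zip(features, values)) for values in product(*domain.values())]
    volumes = [0] * len(rules)
    for cell in cells:
        volumes[leaf_index(cell, rules)] += 1
    counts = [0] * len(rules)
    for point in data:
        counts[leaf_index(point, rules)] += 1
    n = len(data)
    # Density per cell = (fraction of data in leaf) / (number of cells in leaf).
    return [c / (n * v) if v else 0.0 for c, v in zip(counts, volumes)]


def density(point, rules, leaf_density):
    """Look up the constant density of the leaf that contains the point."""
    return leaf_density[leaf_index(point, rules)]
```

With three leaves, the fitted model stores three numbers, whereas a full histogram over the same domain would store one number per cell. Summing `density` over every cell of the domain gives 1, so the leaves together define a valid probability mass function.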


Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

 
 