Abstract:
Topological Data Analysis (TDA) is an attempt to apply topological concepts of 'shape' to data clouds by finding clusters, holes, tunnels, and similar features. In particular, cluster analysis, a well-known statistical technique, is a very simple special case of TDA. The main method unique to TDA is to encode the persistent homology of a data set as a parameterized version of the Betti numbers, displayed as a persistence diagram or Betti barcode. I propose to use currently available R packages such as diffusionMap, randomForest, ggplot2, and phom (persistent homology) to apply TDA to the Census Bureau's 2014 Planning Database at the tract level. The initial step in the proposed research will be to apply statistical 'lenses' such as random forests, distance matrices, and possibly Principal Components Analysis to produce metrics that can be used as proximity measures. The low-dimensional structure produced by these metrics can be viewed through plots such as those from a diffusion map. Finally, the 'persistent' structure will be investigated using Betti barcodes.
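To make the link between cluster analysis and Betti barcodes concrete, here is a minimal Python sketch of a 0-dimensional barcode, which tracks connected components (clusters) as a distance scale grows. It is an illustration under stated assumptions, not the proposed R workflow: it uses neither phom nor the Planning Database, the names `h0_barcode` and `clusters_at` are hypothetical, and the toy points are made up. Two points are treated as connected once the scale reaches their Euclidean distance; some software parameterizes by half that distance instead.

```python
from itertools import combinations
from math import dist, inf


def h0_barcode(points):
    """Return 0-dimensional persistence bars (birth, death) for a point cloud.

    Every point starts as its own component at scale 0. Edges are added in
    order of increasing length, and each edge that merges two components
    ends one bar at that length (this is single-linkage clustering).
    """
    n = len(points)
    if n == 0:
        return []
    parent = list(range(n))  # union-find forest over the points

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    bars = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge merges two components
            parent[ri] = rj
            bars.append((0.0, length))
    bars.append((0.0, inf))  # one component never dies
    return bars


def clusters_at(bars, scale):
    """Number of components alive at the given scale (the Betti-0 number)."""
    return sum(1 for birth, death in bars if birth <= scale < death)


# Hypothetical toy data: a tight group of three points and a distant pair.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 0.0), (10.0, 1.0)]
bars = h0_barcode(points)
for bar in bars:
    print(bar)
print(clusters_at(bars, 2.0))
```

Long bars correspond to clusters that persist over a wide range of scales; short bars are more likely noise. Holes and tunnels require higher-dimensional homology, which this sketch does not attempt.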
Copyright © American Statistical Association.