Abstract Details
Activity Number: 543
Type: Contributed
Date/Time: Wednesday, August 7, 2013 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Graphics
Abstract - #310066
Title: Maximum Entropy Summary Trees
Author(s): Kenneth Shirley*+ and Howard Karloff
Companies: AT&T Labs and AT&T Labs
Keywords: Entropy ; Hierarchy ; d3 ; Summarization ; Visualization ; Trees
Abstract:
We present a method for visualizing and summarizing large, tree-structured data. Many data sets can be represented by a rooted, node-weighted tree, in which the weights represent some attribute of interest for each node; examples include a company organizational chart, clicks on webpages, flows to and from IP addresses, and hard disk file structures. If such a tree has thousands (or millions) of nodes, it is difficult to visualize on a single sheet of paper or computer screen. We define a way to aggregate the weights of a large, n-node tree into a smaller k-node "summary tree" (where k is something like 50 or 100), and we present a dynamic programming algorithm to compute the summary tree with maximum entropy among all summary trees of a given size. Here the entropy of a node-weighted tree is defined as the entropy of the discrete probability distribution whose probabilities are the normalized node weights. We discuss and provide examples of how this algorithm produces useful visualizations and may also be optimal for certain kinds of data analysis tasks.
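The entropy definition in the abstract can be illustrated with a small Python sketch. It is not the authors' dynamic programming algorithm. It only computes the entropy of a node-weighted tree from its normalized weights and shows one simple kind of aggregation, collapsing a subtree into its root by summing weights. The function names, the toy tree, the dictionary representation, and the choice of base-2 logarithms are all assumptions made for illustration; the abstract does not specify them, nor the exact form of aggregation used in a summary tree.

```python
import math


def tree_entropy(weights):
    """Entropy (in bits, an assumed base) of the distribution formed by
    normalizing the node weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("total weight must be positive")
    probs = [w / total for w in weights if w > 0]  # zero weights contribute nothing
    return -sum(p * math.log2(p) for p in probs)


def subtree_nodes(children, node):
    """Return the given node and all of its descendants."""
    stack, found = [node], []
    while stack:
        current = stack.pop()
        found.append(current)
        stack.extend(children.get(current, []))
    return found


def collapse_subtree(weights, children, node):
    """Hypothetical aggregation step: replace the subtree rooted at `node`
    by a single node whose weight is the sum of the subtree's weights."""
    merged = subtree_nodes(children, node)
    descendants = set(merged) - {node}
    new_weights = {n: w for n, w in weights.items() if n not in descendants}
    new_weights[node] = sum(weights[n] for n in merged)
    return new_weights


# Toy 5-node tree (made-up weights, for illustration only).
children = {"root": ["a", "b"], "a": ["a1", "a2"], "b": [], "a1": [], "a2": []}
weights = {"root": 1, "a": 4, "b": 2, "a1": 3, "a2": 6}

summary = collapse_subtree(weights, children, "a")  # a 3-node aggregation
full_entropy = tree_entropy(weights.values())
summary_entropy = tree_entropy(summary.values())
```

Aggregation preserves the total weight, and merging nodes can never increase the entropy, so `summary_entropy` is at most `full_entropy`. The method described in the abstract searches over all summary trees of a given size k for the one whose entropy is largest; this sketch evaluates only a single candidate.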
Authors who are presenting talks have a * after their name.
2013 JSM Online Program Home
The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
Copyright © American Statistical Association.