Abstract Details

Activity Number: 548 - The Evolution and Future Direction of Statistical Computing and Visualization
Type: Invited
Date/Time: Wednesday, August 2, 2017 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Computing
Abstract #325028
Title: Practical Data Analysis: An Algorithmic Approach
Author(s): Leland Wilkinson*
Companies: H2O.ai
Keywords:
Abstract:

Theory and practice do not always coincide in the world of real data analysis. This paper presents a new practical algorithm, called hdoutliers, for detecting multidimensional outliers. It is designed specifically to a) deal with a mixture of categorical and continuous variables, b) deal with the curse of dimensionality (many columns of data), c) deal with many rows of data, d) deal with outliers that mask other outliers, and e) deal consistently with unidimensional and multidimensional problems. Unlike ad hoc methods found in many machine learning papers, hdoutliers is based on a distributional model that allows outliers to be tagged with a probability. And unlike many methods found in the statistical literature, it presents opportunities for extending the problem to messy datasets.


Authors who are presenting talks have a * after their name.

Back to the full JSM 2017 program

 
 
Copyright © American Statistical Association