Abstract Details

Activity Number: 670 - Statistical Computing in Data Science
Type: Contributed
Date/Time: Thursday, August 3, 2017 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Computing
Abstract #322450
Title: Automated Detection of Data Quality Anomalies in 3rd Party Administrative Data
Author(s): Thomas George*
Companies: Vidoori
Keywords: Data ; Data Quality ; Data Subscription ; 3rd Party Data ; Third Party Data
Abstract:

Using data collected by a 3rd party makes all subsequent observations dependent on whatever data quality anomalies are present in that 3rd party data. With the advent of large, purchasable datasets used to make informed statistical decisions, it becomes paramount to understand how to detect and quantify the extent of anomalous data before analysis can begin. The most powerful methodology for detecting and correcting data quality anomalies within a data set requires knowledge of the 3rd party's business practices, which is often hard to acquire. Therefore, global and widely applicable methods for quantifying anomalies within a given data set become more important. This paper explores quantifying data quality anomalies through statistical analysis checks applied to different types of data, and presents a metric for evaluating purchased administrative data.


Authors who are presenting talks have a * after their name.

Back to the full JSM 2017 program

 
 
Copyright © American Statistical Association