Abstract Details

Activity Number: 532 - Can Statistics Inform Decisions in Social, Economic, and Political Event?
Type: Contributed
Date/Time: Wednesday, August 1, 2018 : 10:30 AM to 12:20 PM
Sponsor: Social Statistics Section
Abstract #327267
Title: Recommender System Approaches for Data Quality and Data Validation
Author(s): Anne Parker* and William Roberts and Danielle Gewurz
Companies: Internal Revenue Service and Deloitte and Deloitte
Keywords: recommender system; tax compliance; data validation
Abstract:

Recommender systems are a family of unsupervised machine learning approaches, widely used in commercial industry, that can estimate missing or non-existent values as well as identify outliers and anomalies. Such a model is both trained on and applied to the same set of data, which removes the need for the degree of curated training data that supervised predictive modeling requires. In this paper, we describe one such application, in which a collaborative filtering model is trained on sparse data to estimate population parameters, which are then used to identify anomalous values among a population of millions of observations across a range of data fields. An estimate is produced for each field of an observation based on the entire population's values for that field as well as the other fields of that observation. Anomalies are detected by comparing these expected values against the observed values. In preliminary testing, the collaborative filtering model improves upon the methods currently used within the IRS to identify anomalies.
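The estimate-then-compare idea described in the abstract can be sketched in a few lines. The abstract does not say which collaborative filtering model was used, so the sketch below assumes the simplest possible one: a rank-1 matrix factorization fitted by alternating least squares on the observed cells only. The function names (`fit_rank1`, `rank_residuals`), the toy data, and the injected anomaly are all hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple

# Rows are observations, columns are data fields; None marks a missing value.
Matrix = List[List[Optional[float]]]


def fit_rank1(data: Matrix, iterations: int = 100) -> Tuple[List[float], List[float]]:
    """Fit data[i][j] ~ row_f[i] * col_f[j] on observed cells only.

    Assumes every row and every column has at least one observed value.
    """
    n_rows, n_cols = len(data), len(data[0])
    row_f = [1.0] * n_rows
    col_f = [1.0] * n_cols
    for _ in range(iterations):
        # Least-squares update of each row factor, column factors held fixed.
        for i in range(n_rows):
            seen = [j for j in range(n_cols) if data[i][j] is not None]
            row_f[i] = (sum(data[i][j] * col_f[j] for j in seen)
                        / sum(col_f[j] ** 2 for j in seen))
        # Least-squares update of each column factor, row factors held fixed.
        for j in range(n_cols):
            seen = [i for i in range(n_rows) if data[i][j] is not None]
            col_f[j] = (sum(data[i][j] * row_f[i] for i in seen)
                        / sum(row_f[i] ** 2 for i in seen))
    return row_f, col_f


def rank_residuals(data: Matrix, row_f: List[float],
                   col_f: List[float]) -> List[Tuple[float, int, int]]:
    """Return (|observed - estimated|, row, column) for observed cells, largest first."""
    scored = [(abs(data[i][j] - row_f[i] * col_f[j]), i, j)
              for i in range(len(data))
              for j in range(len(data[0]))
              if data[i][j] is not None]
    return sorted(scored, reverse=True)


# Made-up data: each cell is (observation scale) * (field scale).
row_scale = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
field_scale = [1.0, 2.0, 3.0, 4.0]
data: Matrix = [[r * f for f in field_scale] for r in row_scale]

# Knock out a few cells to make the data sparse.
for i, j in [(0, 3), (4, 0), (6, 2)]:
    data[i][j] = None

# Inject one anomaly: the value consistent with the pattern would be 6.0.
data[2][1] = 30.0

row_f, col_f = fit_rank1(data)
ranked = rank_residuals(data, row_f, col_f)
top_residual, top_i, top_j = ranked[0]
print("largest residual at cell", (top_i, top_j))

# The same fitted model also yields an estimate for a missing cell.
filled = row_f[4] * col_f[0]
print("estimate for missing cell (4, 0):", round(filled, 2))
```

The model is fitted on the same matrix it later scores, mirroring the "trained on and applied to the very same set of data" point above. In this toy data the injected cell (2, 1) should come out with the largest residual. A realistic setting would need a higher-rank model, regularization, and per-field scaling of residuals, none of which are shown here.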


Authors who are presenting talks have a * after their name.

Back to the full JSM 2018 program