
Abstract Details

Activity Number: 321 - Modern Statistical Learning for Ranking and Crowdsourcing
Type: Topic Contributed
Date/Time: Tuesday, August 1, 2017 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #322826
Title: A Permutation-Based Model for Crowdsourcing: Optimal Estimation and Robustness
Author(s): Nihar B Shah* and Sivaraman Balakrishnan and Martin J. Wainwright
Companies: Univ of California - Berkeley ; Department of Statistics, CMU ; EECS and Statistics, University of California, Berkeley
Keywords: high dimensional statistics ; crowd labeling ; classification
Abstract:

Aggregating and denoising crowd-labeled data has become increasingly important with the advent of crowdsourcing platforms and the growing need for massive labeled datasets. In this paper, we propose a permutation-based model for crowd-labeled data that significantly generalizes the popular "Dawid-Skene" model. Working in a high-dimensional, non-asymptotic framework, we derive optimal rates of convergence for the permutation-based model. We show that, owing to its richness, the permutation-based model offers significant robustness in estimation while, surprisingly, incurring only a small statistical penalty compared to the Dawid-Skene model. Finally, we propose a polynomial-time algorithm, called OBI-WAN, for provably efficient estimation under these models.
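
As an illustration of the crowd-labeling setting only, the following is a minimal sketch that assumes binary labels in {-1, +1} and the one-coin form of the Dawid-Skene model, in which each worker answers every question correctly with a single worker-specific probability. The aggregator shown is plain majority voting, used here as a simple baseline; it is not the OBI-WAN algorithm, whose details the abstract does not give. All function names, parameter names, and numeric values are hypothetical.

```python
import random


def simulate_one_coin(true_labels, worker_accuracy, rng):
    """Simulate responses under a one-coin Dawid-Skene model (assumed setup).

    Worker i reports the true label with probability worker_accuracy[i]
    and the flipped label otherwise. Returns one row of responses per worker.
    """
    return [
        [y if rng.random() < p else -y for y in true_labels]
        for p in worker_accuracy
    ]


def majority_vote(responses):
    """Baseline aggregator: sign of the per-question sum, ties go to +1."""
    n_questions = len(responses[0])
    estimates = []
    for j in range(n_questions):
        total = sum(row[j] for row in responses)
        estimates.append(1 if total >= 0 else -1)
    return estimates


# Hypothetical example: 200 questions, 5 workers of varying accuracy.
rng = random.Random(0)
true_labels = [rng.choice([-1, 1]) for _ in range(200)]
worker_accuracy = [0.9, 0.8, 0.7, 0.6, 0.55]
responses = simulate_one_coin(true_labels, worker_accuracy, rng)
estimates = majority_vote(responses)
error_rate = sum(e != y for e, y in zip(estimates, true_labels)) / len(true_labels)
print(f"majority-vote error rate: {error_rate:.3f}")
```

Majority voting weights every worker equally, so it ignores differences in worker accuracy; it serves here only to make the aggregation task concrete.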


Authors who are presenting talks have a * after their name.
