Abstract Details

Activity Number: 139 - Competing Effectively: Hosting, Designing, and Participating in Kaggle-Style Competitions
Type: Invited
Date/Time: Monday, July 30, 2018 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistics in Defense and National Security
Abstract #326650
Title: Effective Data Competition Hosting: Strategic Design and Analysis to Maximize Learning
Author(s): Christine M Anderson-Cook* and Kary Myers
Companies: Los Alamos National Laboratory and Los Alamos National Laboratory
Keywords: Kaggle; Crowdsourcing; Design of Experiments; Algorithm Assessment

Leveraging the depth and breadth of solutions available through crowdsourcing can be a powerful accelerator of method development for high-consequence problems. While data science competitions have become popular and prevalent, their implementations are highly variable and can sometimes lead to solutions that are poorly matched to the real problem of interest. This talk outlines considerations when hosting a competition, such as (1) defining the precise problem, (2) specifying data sets that test both interpolation and extrapolation to new scenarios and that focus on the regions of the input space of greatest interest, (3) determining a robust and relevant scoring metric that orders the competitors in a way that matches the study goals, and (4) developing an analysis that provides summaries beyond just a winner on the leaderboard.

Authors who are presenting talks have a * after their name.

Back to the full JSM 2018 program