Abstract:
|
Crowdsourcing can surface a depth and breadth of solutions that accelerates method development for high-consequence problems. Data science competitions have become popular, but their implementations vary widely and can produce solutions that are poorly matched to the real problem of interest. This talk outlines considerations for hosting a competition: (1) defining the problem precisely, (2) specifying data sets that test both interpolation and extrapolation to new scenarios and that emphasize the input regions of greatest interest, (3) choosing a robust and relevant scoring metric that ranks competitors in line with the study goals, and (4) developing an analysis that provides summaries beyond just the winner on the leaderboard.
|