Abstract Details

Activity Number: 389 - Improving Survey Data Quality with Machine Learning Techniques
Type: Invited
Date/Time: Tuesday, July 31, 2018 : 2:00 PM to 3:50 PM
Sponsor: Survey Research Methods Section
Abstract #326578 Presentation
Title: A Comparison of Automatic Algorithms for Occupation Coding
Author(s): Malte Schierholz*
Companies: Institute for Employment Research
Keywords: occupation coding; coding index; supervised learning; method comparison

Occupation coding is the assignment of respondents' textual survey answers to categories of an official occupational classification. Done manually, it is time-consuming and expensive, so several algorithms have been proposed to automate the process. To overcome deficiencies of existing techniques and to provide probabilistic predictions, we introduce a new method that combines training data from previous studies with job titles from a coding index. Using data from various German surveys, we compare our new method with some of the main algorithms described in the literature, including regularized logistic regression, gradient boosting, nearest neighbors, memory-based reasoning, and string similarity. We discuss the strengths and weaknesses of each algorithm.
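
To make the string-similarity idea concrete, the following is a minimal sketch of how a respondent's answer might be matched against job titles in a coding index. It is not the method introduced in this talk, nor any of the specific algorithms compared. The index entries, the category codes (`C1` to `C4`), and the function names `similarity` and `suggest_codes` are hypothetical and chosen only for illustration. The similarity measure is Python's standard `difflib.SequenceMatcher` ratio, used here as a stand-in for whatever measure a real system would use.

```python
from difflib import SequenceMatcher

# Hypothetical coding index: job title -> made-up category code.
# A real coding index would map many thousands of job titles to the
# codes of an official occupational classification.
CODING_INDEX = {
    "nurse": "C1",
    "software developer": "C2",
    "primary school teacher": "C3",
    "truck driver": "C4",
}


def similarity(a, b):
    """Return a similarity score in [0, 1] between two strings."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def suggest_codes(answer, index=CODING_INDEX, top_k=2):
    """Rank index entries by similarity to the respondent's answer.

    Returns a list of (code, job_title, score) tuples, best match first.
    """
    scored = [
        (code, title, similarity(answer, title))
        for title, code in index.items()
    ]
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:top_k]
```

Calling `suggest_codes("Software Developer")` under these assumptions ranks the `C2` entry first. The scores are raw similarity ratios, not calibrated probabilities, so a sketch like this does not by itself provide the probabilistic predictions the abstract refers to.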

Authors who are presenting talks have a * after their name.

Back to the full JSM 2018 program