Name: 2018 Joint Statistical Meetings
Start: 2018-07-28T07:00:00+00:00
End: 2018-08-02
Location: Vancouver Convention Centre

Activity Number:	389 - Improving Survey Data Quality with Machine Learning Techniques
Type:	Invited
Date/Time:	Tuesday, July 31, 2018 : 2:00 PM to 3:50 PM
Sponsor:	Survey Research Methods Section
Abstract #326578	Presentation
Title:	A Comparison of Automatic Algorithms for Occupation Coding
Author(s):	Malte Schierholz*
Companies:	Institute for Employment Research
Keywords:	occupation coding; coding index; supervised learning; method comparison
Abstract:	Occupation coding refers to the assignment of respondents' textual survey answers to categories of an official occupational classification. Manual coding is time-consuming and expensive, and several algorithms have been suggested to automate the process. To overcome deficiencies of existing techniques and to provide probabilistic predictions, we introduce a further method that combines training data from previous studies with job titles from a coding index. Using data from various German surveys, we compare our new method with some of the main algorithms described in the literature, including regularized logistic regression, gradient boosting, nearest neighbors, memory-based reasoning, and string similarity. Strengths and weaknesses of each algorithm are discussed.
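
As an illustration of the simplest family of approaches named in the abstract (string similarity against a coding index), the following is a minimal sketch and not the method presented in the talk. The coding index, its category codes (C01 to C04), and the function names are hypothetical, and the similarity measure (Python's difflib) is an assumption chosen for brevity.

```python
from difflib import SequenceMatcher

# Hypothetical coding index: job title -> made-up category code.
# A real index would map many titles to codes of an official classification.
CODING_INDEX = {
    "nurse": "C01",
    "software developer": "C02",
    "primary school teacher": "C03",
    "truck driver": "C04",
}


def similarity(a, b):
    """Similarity in [0, 1] between two strings, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def suggest_codes(answer, index=CODING_INDEX, top_k=3):
    """Rank category codes by their best-matching job title in the index."""
    best = {}
    for title, code in index.items():
        score = similarity(answer, title)
        # Keep the highest score seen for each code.
        best[code] = max(best.get(code, 0.0), score)
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    # A misspelled survey answer is still matched to the closest index title.
    for code, score in suggest_codes("Software Developper"):
        print(code, round(score, 2))
```

The scores returned here are similarity ratios, not probabilities. Producing probabilistic predictions, as the abstract describes, would additionally require training data from previously coded answers.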

Authors who are presenting talks have a * after their name.

JSM 2018 Online Program