615 – Model-Assisted Estimation - 1
Analyzing Open-Ended Survey Questions Using Unsupervised Learning Methods
Fang Wang
NORC at the University of Chicago
Edward Mulrow
NORC at the University of Chicago
Unsupervised learning methods such as topic modeling and k-means clustering provide techniques for organizing, understanding, and summarizing text data without using any manually labeled records as training data. These methods annotate the text automatically to organize it and to discover latent themes in documents that have no target attributes. We explore using unsupervised learning to classify responses to open-ended survey questions. By grouping similar responses together, we construct a set of "topics" and reduce the exploration of open-ended text to standard categorical analysis. We present topic modeling and k-means clustering examples using data from different surveys. Each resulting topic category is described by a set of keywords.
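
As a rough illustration of the clustering idea described above, the following minimal Python sketch groups a handful of responses with k-means and labels each group by its highest-weight keywords. It is not the authors' implementation: the responses, the stopword list, the normalized term-count representation, the farthest-point initialization, and every function name are assumptions made only for this example.

```python
import math
from collections import Counter

# Hypothetical toy responses (illustrative only; not data from the surveys).
RESPONSES = [
    "the price is too high",
    "too expensive, the price went up",
    "high price and expensive fees",
    "staff were friendly and helpful",
    "helpful staff, friendly service",
    "friendly service from the staff",
]
# Assumed minimal stopword list for this toy example.
STOPWORDS = {"the", "is", "and", "were", "from", "too", "up", "went"}


def tokenize(text):
    """Lowercase, replace punctuation with spaces, and drop stopwords."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return [w for w in cleaned.split() if w not in STOPWORDS]


def vectorize(docs):
    """Return the vocabulary and one L2-normalized term-count vector per doc."""
    vocab = sorted({w for d in docs for w in tokenize(d)})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for d in docs:
        v = [0.0] * len(vocab)
        for w, count in Counter(tokenize(d)).items():
            v[index[w]] = float(count)
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iters=20):
    """Plain k-means with deterministic farthest-point initialization."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(far))
    labels = []
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def top_keywords(vocab, centroid, n=3):
    """Describe a cluster by the n vocabulary terms with the largest weight."""
    ranked = sorted(zip(centroid, vocab), reverse=True)
    return [word for weight, word in ranked[:n] if weight > 0]


vocab, vectors = vectorize(RESPONSES)
labels, centroids = kmeans(vectors, k=2)
keywords = [top_keywords(vocab, c) for c in centroids]
for j, words in enumerate(keywords):
    print(f"topic {j}: {', '.join(words)}")
```

Once every response carries a cluster label, the labels can be tabulated or cross-classified like any other categorical survey variable. In practice the number of clusters, the text representation, and the stopword list would all need to be tuned to the survey at hand.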