Abstract Details

Activity Number: 50 - Which Sessions Should This Go To? Text Analytics to the Rescue of Conference Committees
Type: Invited
Date/Time: Sunday, July 29, 2018 : 4:00 PM to 5:50 PM
Sponsor: Section on Statistical Computing
Abstract #326705
Title: Identifying and Utilizing Research Topics in Conference Abstracts
Author(s): Stas Kolenikov* and Alison Thaung
Companies: Abt Associates and Abt Associates
Keywords: text analysis; hierarchical clustering; classification; professional service

Combining conference abstracts into coherent sessions is one of the most burdensome tasks of an ASA section program chair. Pan, Zou and Yu applied unsupervised learning to find the abstracts that are most similar to each other in the TF-IDF space. We extend this approach by adding a semi-supervised component in the form of a research classification scheme: a hierarchical structure in which the first character of the classification code represents a broad topic and the subsequent characters represent more detailed topics. The hierarchical structure of the classification code provides an implicit distance metric for abstracts that share all or part of their classification codes: abstracts that share the full classification code are deemed closest to one another, while those that share only the stem are more distant. In this approach, instead of analyzing individual terms or bigrams, we first classify each abstract within the classification scheme and then group the abstracts based on their distance in the research topic hierarchy. The process is demonstrated with the historical abstracts of the Survey Research Methods Section and applied to the 2018 submissions.
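
The implicit distance described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact metric, so the one below is an assumption: the distance between two codes is the number of characters in the longer code that fall outside their shared prefix. The function names, abstract IDs, and classification codes are all hypothetical and invented for illustration; they are not taken from the authors' scheme.

```python
def shared_prefix_length(code_a: str, code_b: str) -> int:
    """Count the leading characters two classification codes have in common."""
    length = 0
    for char_a, char_b in zip(code_a, code_b):
        if char_a != char_b:
            break
        length += 1
    return length


def code_distance(code_a: str, code_b: str) -> int:
    """Assumed metric: characters of the longer code beyond the shared prefix.

    Identical codes are at distance 0; codes that share only a stem are
    farther apart; codes with different first characters are farthest.
    """
    return max(len(code_a), len(code_b)) - shared_prefix_length(code_a, code_b)


def nearest_neighbors(target: str, codes: dict) -> list:
    """Rank the other abstracts by distance to the target (ties broken by ID)."""
    others = [key for key in codes if key != target]
    return sorted(
        others,
        key=lambda key: (code_distance(codes[target], codes[key]), key),
    )


# Hypothetical abstracts and codes, invented for illustration only.
abstract_codes = {
    "abstract_1": "A12",
    "abstract_2": "A12",  # same full code as abstract_1
    "abstract_3": "A15",  # shares the stem "A1" with abstract_1
    "abstract_4": "B30",  # different broad topic
}

ranking = nearest_neighbors("abstract_1", abstract_codes)
print(ranking)
```

Under this assumed metric, abstracts with the same full code rank first as session companions, followed by those sharing a stem, with abstracts from a different broad topic ranked last.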

Authors who are presenting talks have a * after their name.

Back to the full JSM 2018 program